Advanced 

Engineering 

Mathematics 


C. R. WYLIE, JR. 

Professor and Chairman, Department of Mathematics, 
University of Utah 


THIRD EDITION 


INTERNATIONAL STUDENT EDITION 


McGRAW-HILL BOOK COMPANY 

New York 
St. Louis 
San Francisco 
Toronto 








Advanced Engineering Mathematics 

INTERNATIONAL STUDENT EDITION 

Exclusive rights by Kogakusha Co., Ltd. for manufacture and 
export from Japan. This book cannot be re-exported from the 
country to which it is consigned by Kogakusha Co., Ltd. or by 
McGraw-Hill Book Company or any of its subsidiaries. 




TOKYO, JAPAN 



The first edition of this book was written with the announced 
purpose of providing an introduction to those branches of mathe- 
matics with which the average analytical engineer or physicist 
should be reasonably familiar in order to carry on his own work 
effectively and keep abreast of current developments in his field. 
In the present edition, as in the second, although the material 
has been completely rewritten, the objective remains the same, 
and the various additions, deletions, and refinements have been 
made only because they seemed to contribute to the realization 
of this goal. 

Because ordinary differential equations are probably the 
most immediately useful part of postcalculus mathematics for the 
student of applied science and because the techniques of solving 
simple ordinary differential equations stem naturally from the 
techniques of calculus, the chapter on determinants and matrices 
with which the second edition began has been made a later chap- 
ter, and the book now begins with a chapter on ordinary differ- 
ential equations of the first order. This is followed by two other 
chapters on differential equations which develop the subject as 
far as the solution of systems of simultaneous linear equations 
with constant coefficients. Following these is a chapter on finite 
differences containing not only the usual applications to interpo- 
lation, numerical differentiation and integration, and the step- 
by-step solution of differential equations, but also a section on 
linear difference equations with constant coefficients paralleling 
closely the preceding development for differential equations. 
This chapter also includes a discussion of curve fitting and the 
smoothing of data, as well as the method of least squares and the 
related topic of orthogonal polynomials. One innovation in the 
present edition is the introduction of the Runge-Kutta method in 
addition to Milne’s method for the step-by-step solution of differ- 
ential equations. It is hoped that the material in this chapter will 



a more extensive course in numerical analysis may be based. 
The fifth chapter is devoted to the application of the foregoing 
theory to mechanical and electrical systems, and, as in the first 
two editions, the mathematical identity of the two fields is 
emphasized. However, a detailed discussion of the construction 
of electromechanical analogies is no longer included. The next 
two chapters deal, respectively, with Fourier series and integrals 
and with the Laplace transform, very much as did the correspond- 
ing chapters in the second edition. The chapters on separable 
partial differential equations and Bessel functions follow closely 
the development in the second edition, although many of the 
examples are new. 

The material on determinants and matrices which formed 
Chapter 1 in the second edition now appears substantially 
expanded as two chapters which follow the material on differ- 
ential equations and related topics. Next comes the chapter on 
vector analysis which, except for minor changes, is essentially 
the same as in the second edition. Following the chapter on vector 
analysis is a new chapter devoted to an introduction to tensor 
analysis. The last four chapters cover the theory of functions of a 
complex variable very much as did the corresponding chapters 
in the second edition. 

The book as presently organized falls naturally into three 
major subdivisions. The first nine chapters constitute a reason- 
ably self-contained treatment of ordinary and partial differential 
equations and their applications. The next four chapters cover 
the related areas of linear algebra, vector analysis, and tensor 
analysis; and the last four chapters cover the elementary theory 
and applications of functions of a complex variable. With this 
organization, the book, which contains enough material for a 
two-year postcalculus course in applied mathematics, is well 
adapted to use as a text for any of several shorter courses. 

In the third edition, as in the first two, every effort has been 
made to keep the presentation detailed and clear while at the same 
time maintaining acceptable standards of precision and accuracy. 
To achieve this, more than the usual number of worked examples 
and carefully drawn figures have been included, and in every 
development there has been a conscious attempt to make the 
transitions from step to step so clear that a student with no more 
than a good background in calculus should seldom be held up more 
than momentarily. Over 400 new exercises of varying degrees of 
difficulty have been added to the problems already in the second 
edition. These range from formal problems of a purely routine 
nature to practical applications of considerable complexity. 
Hints are included in many of the exercises, and answers to the 
odd-numbered ones are given at the end of the book. As in the 
first two editions, words and phrases defined in the body of the 



a sign of emphasis. Theorems, corollaries, and formal definitions 
are set on wider lines than the main body of the text, and illus- 
trative examples are set in type of a different size. 

The indebtedness of the author to his colleagues, students, 
and former teachers is too great to catalog, and to all who have 
given help and encouragement in the preparation of this book, I 
can offer here only a most inadequate acknowledgment of my 
appreciation. In particular, I am deeply grateful to those users 
of this book who have been kind enough to write me their impres- 
sions and criticisms of the first two editions and their sugges- 
tions for an improved third edition. Finally, I must express my 
gratitude to my wife, Ellen, and to my secretary, Mrs. Jason 
Everts, who gave me invaluable assistance in proofreading the 
manuscript. 

C. R. WYLIE, JR. 




To the 
Student 


This book has been written to help you in your development as an 
applied scientist, whether engineer, physicist, chemist, or mathe- 
matician. It contains material which you will find of great use, 
not only in the technical courses you have yet to take, but also 
in your profession after graduation as long as you deal with the 
analytical aspects of your field. 

I have tried to write a book which you will find not only 
useful but also easy to study from, at least as easy as a book on 
advanced mathematics can be. There is a good deal of theory in 
it, for it is the theoretical portion of a subject which is the basis 
for the nonroutine applications of tomorrow. But nowhere will 
you find theory for its own sake, interesting and legitimate as 
this may be to a pure mathematician. Our theoretical discussions 
are designed to illuminate principles, to indicate generalizations, 
to establish limits within which a given technique may or may 
not safely be used, or to point out pitfalls into which one might 
otherwise stumble. On the other hand, there are many applica- 
tions illustrating, with the material at hand, the usual steps in 
the solution of a physical problem: formulation, manipulation, 
and interpretation. These examples are, without exception, care- 
fully set up and completely worked, with all but the simplest 
steps included. Study them carefully, with paper and pencil at 
hand, for they are an integral part of the text. If you do this you 
should find the exercises, though challenging, still within your 
ability to work. 

There are two minor points of notation which, when appreci- 
ated, should add to the ease with which you can read this book. 
In the first place, concepts and terms defined in formal definitions 
and terms defined informally in the body of the text are always 
indicated by the use of boldface type. Second, italic type is used 



which a person would place upon key words when speaking. One 
final suggestion to you in your study of this book is that you read 
each section through for the main ideas before you concentrate on 
filling in any of the details. You will probably be surprised at how 
many times a detail which seems to hold you up in one paragraph 
is explained in the next as the discussion unfolds. 

Because this book is a long one and contains material suitable 
for various courses, your teacher may begin with any of a number 
of chapters. However, the overall structure of the book is the fol- 
lowing: The first nine chapters are devoted to the general theme 
of ordinary and partial differential equations and related topics. 
Here you will find basic analytical techniques for solving the 
equations in which physical problems must be formulated when 
continuously changing quantities are involved. Chapters 10 
through 13 deal with the somewhat related topics of linear alge- 
bra, vector analysis, and an introduction to generalized coordi- 
nates and tensor analysis. Finally, Chapters 14 through 17 
provide an introduction to the theory and applications of func- 
tions of a complex variable. (Chapter 4, in particular, is worthy 
of note because it provides an introduction to numerical analysis, 
the modern field which deals with techniques for obtaining 
numerical answers to problems too complicated to be solved by 
exact analytic methods.) 

It has been gratifying to me to receive from time to time 
letters from students who have used this book, giving me their 
reactions to it, pointing out errors and misprints in it, and offering 
suggestions for its improvement. Should you be inclined to do so, 
I should be happy to hear from you also. And now good luck and 
every success. 


C. R. WYLIE, JR. 



Contents 

Preface 
To the Student 

chapter 1 
Ordinary differential equations of the first order 
1.1 Introduction 
1.2 Fundamental Definitions 
1.3 Separable First-order Equations 
1.4 Homogeneous First-order Equations 
1.5 Exact First-order Equations 
1.6 Linear First-order Equations 
1.7 Applications of First-order Differential Equations 

chapter 2 
Linear differential equations with constant coefficients 
2.1 The General Linear Second-order Equation 
2.2 The Homogeneous Linear Equation with Constant Coefficients 
2.3 The Nonhomogeneous Equation 
2.4 Particular Integrals by the Method of Variation of Parameters 
2.5 Equations of Higher Order 
2.6 Applications 

chapter 3 
Simultaneous linear differential equations 
3.1 Introduction 
3.2 The Reduction of a System to a Single Equation 
3.3 Complementary Functions and Particular Integrals for Systems 

chapter 4 
Finite differences 
4.1 The Differences of a Function 
4.2 Interpolation Formulas 
4.3 Numerical Integration and Differentiation 
4.4 The Numerical Solution of Differential Equations 
4.5 Difference Equations 
4.6 The Method of Least Squares 

chapter 5 
Mechanical and electrical circuits 
5.1 Introduction 
5.2 Systems with One Degree of Freedom 
5.3 The Translational-mechanical System 
5.4 The Series-electrical Circuit 
5.5 Systems with Several Degrees of Freedom 

chapter 6 
Fourier series and integrals 
6.1 Introduction 
6.2 The Euler Coefficients 
6.3 Half-range Expansions 
6.4 Alternative Forms of Fourier Series 
6.5 Applications 
6.6 Harmonic Analysis 
6.7 The Fourier Integral as the Limit of a Fourier Series 
6.8 From the Fourier Integral to the Laplace Transform 

chapter 7 
Laplace transformation 
7.1 Theoretical Preliminaries 
7.2 The General Method 
7.3 The Transforms of Special Functions 
7.4 Further General Theorems 
7.5 The Heaviside Expansion Theorems 
7.6 Transforms of Periodic Functions 
7.7 Convolution and the Duhamel Formulas 

chapter 8 
Partial differential equations 
8.1 Introduction 
8.2 The Derivation of Equations 
8.3 The D’Alembert Solution of the Wave Equation 
8.4 Separation of Variables 
8.5 Orthogonal Functions and the General Expansion Problem 

chapter 9 
Bessel functions and Legendre polynomials 
9.1 Theoretical Preliminaries 
9.2 The Series Solution of Bessel’s Equation 
9.3 Modified Bessel Functions 
9.4 Equations Reducible to Bessel’s Equation 
9.5 Identities for the Bessel Functions 
9.6 Orthogonality of the Bessel Functions 
9.7 Applications of Bessel Functions 
9.8 Legendre Polynomials 

chapter 10 
Determinants and matrices 
10.1 Determinants 
10.2 Elementary Properties of Matrices 
10.3 Adjoints and Inverses 
10.4 Rank and the Equivalence of Matrices 
10.5 Systems of Linear Equations 
10.6 Matric Differential Equations 

chapter 11 
Further properties of matrices 
11.1 Quadratic Forms 
11.2 The Characteristic Equation of a Matrix 
11.3 The Transformation of Matrices 
11.4 Functions of a Square Matrix 
11.5 The Cayley-Hamilton Theorem 
11.6 Infinite Series of Matrices 

chapter 12 
Vector analysis 
12.1 The Algebra of Vectors 
12.2 Vector Functions of One Variable 
12.3 The Operator ∇ 
12.4 Line, Surface, and Volume Integrals 
12.5 Integral Theorems 
12.6 Further Applications 

chapter 13 
Tensor analysis 
13.1 Introduction 
13.2 Oblique Coordinates 
13.3 Generalized Coordinates 
13.4 Tensors 

chapter 14 
Analytic functions of a complex variable 
14.1 Introduction 
14.2 Algebraic Preliminaries 
14.3 The Geometric Representation of Complex Numbers 
14.4 Absolute Values 
14.5 Functions of a Complex Variable 
14.6 Analytic Functions 
14.7 The Elementary Functions of z 
14.8 Integration in the Complex Plane 

chapter 15 
Infinite series in the complex plane 
15.1 Series of Complex Terms 
15.2 Taylor’s Expansion 
15.3 Laurent’s Expansion 

chapter 16 
The theory of residues 
16.1 The Residue Theorem 
16.2 The Evaluation of Real Definite Integrals 
16.3 The Complex Inversion Integral 
16.4 Stability Criteria 

chapter 17 
Conformal mapping 
17.1 The Geometrical Representation of Functions of z 
17.2 Conformal Mapping 
17.3 The Bilinear Transformation 
17.4 The Schwarz-Christoffel Transformation 

Appendix 
Graeffe’s Root-squaring Process 

Answers to odd-numbered exercises 

Index 



Ordinary 

Differential Equations 
of the First Order 


1.1 Introduction 

An equation involving one or more derivatives of a function is 
called a differential equation. By a solution of a differential 
equation is meant a relation between the dependent and independ- 
ent variables which is free of derivatives and which, when sub- 
stituted into the given equation, reduces it to an identity. The 
study of the existence, nature, and determination of solutions of 
differential equations is of fundamental importance not only to 
the pure mathematician but also to anyone engaged in the 
mathematical analysis of natural phenomena. 

In general, a mathematician considers it a triumph if he is 
able to prove that a given differential equation possesses a solu- 
tion and if he can deduce a few of the more important properties 
of that solution. A physicist or engineer, on the other hand, is 
usually greatly disappointed if a specific expression for the solu- 
tion cannot be exhibited. The usual compromise is to find some 
practical procedure by means of which the required solution can 
be approximated with satisfactory accuracy. 

Not all differential equations are of such difficulty as to 
make this necessary, however, and there are several large and 
very important classes of equations for which solutions can 
readily be found. For instance, an equation such as 

$\dfrac{dy}{dx} = f(x)$ 

is really a differential equation, and the integral 

$y = \int f(x)\,dx + c$ 

is a solution. More generally, the equation 

$\dfrac{d^n y}{dx^n} = g(x)$ 

can be solved by n successive integrations. Except in name, the process of integration 
is actually an example of a process for solving differential 
equations. 
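
As a minimal worked instance of solving by repeated integration, take n = 2. The right-hand side 6x is an assumption chosen only for illustration; it does not come from the text.

```latex
\begin{align*}
\frac{d^2y}{dx^2} &= 6x \\
\frac{dy}{dx} &= \int 6x\,dx = 3x^2 + c_1 \\
y &= \int (3x^2 + c_1)\,dx = x^3 + c_1 x + c_2
\end{align*}
```

Each integration contributes one arbitrary constant, so two integrations leave the two constants $c_1$ and $c_2$.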

In this and the following two chapters we shall consider those 
differential equations which are next in difficulty after those 
which can be solved by direct integration. These equations form 
only a very small part of the class of all differential equations, 
and yet with a knowledge of them a scientist is equipped to handle 
a great variety of applications. To get so much for so little is 
indeed remarkable! 


1.2 Fundamental Definitions 

If the derivatives which appear in a differential equation are total 
derivatives, the equation is called an ordinary differential 
equation; if partial derivatives occur, the equation is called a 
partial differential equation. By the order of a differential equa- 
tion is meant the order of the highest derivative which appears in 
the equation. 

EXAMPLE 1 

The equation $x^2y'' + xy' + (x^2 - 4)y = 0$ is an ordinary differential equation of the second 
order connecting the dependent variable y with its first and second derivatives and with the 
independent variable x. 

EXAMPLE 2 

The equation $\dfrac{\partial^4 u}{\partial x^4} + 2\dfrac{\partial^4 u}{\partial x^2\,\partial y^2} + \dfrac{\partial^4 u}{\partial y^4} = 0$ is a partial differential equation of the fourth order. 

At present we shall be concerned exclusively with ordinary 
differential equations. 

An equation which is linear, that is, of the first degree, in the 
dependent variable and its derivatives is called a linear differential 
equation. All other equations are called nonlinear. In general, 
linear equations are much easier to solve than nonlinear ones, and 
most elementary applications involve linear equations. 

EXAMPLE 3 

The equation $y'' + 4xy' + 2y = \cos x$ is a linear equation of the second order. The presence 
of the terms $xy'$ and $\cos x$ does not alter the fact that the equation is linear, because, by 
definition, linearity is determined solely by the way the dependent variable y and its derivatives 
enter into combination among themselves. 

EXAMPLE 4 


EXAMPLE 5 


The equation $y'' + \sin y = 0$ is nonlinear because of the presence of $\sin y$, which is a nonlinear 
function of y. 

As illustrated by the simple equation 


$\dfrac{dy}{dx} = e^{-x^2}$ 

and its solution 

$y = \int e^{-x^2}\,dx + c$ 


the solution of a differential equation may depend upon integrals 
which cannot be evaluated in terms of elementary functions. This 
example also illustrates the fact that a solution of a differential 
equation usually involves one or more arbitrary constants. 

A detailed treatment of the question of the maximum number 
of essential arbitrary constants that a general solution of a 
differential equation may contain or even of what is meant by 
essential constants is quite difficult.* For our purposes, if an 
expression contains n arbitrary constants we shall consider them 
essential if they cannot, through formal rearrangement of the 
expression, be replaced by any smaller number of constants. 
For example, 

(1)  $a\cos^2 x + b\sin^2 x + c\cos 2x$ 

contains three arbitrary constants. However, since 

$\cos 2x = \cos^2 x - \sin^2 x$ 

the expression (1) can be written in the form 

$a\cos^2 x + b\sin^2 x + c(\cos^2 x - \sin^2 x) = (a + c)\cos^2 x + (b - c)\sin^2 x = d\cos^2 x + e\sin^2 x$ 

where $d = a + c$ and $e = b - c$. The fact that the three arbitrary 
constants a, b, and c can be replaced by the two constants d and e 
shows that the former are not all essential. On the other hand, 
since $\cos^2 x$ and $\sin^2 x$ are linearly independent (whereas $\cos^2 x$, 
$\sin^2 x$, and $\cos 2x$ are linearly dependent), it follows that there is 
no further rearrangement of the given expression that will permit 
d and e to be combined into and replaced by a single new 
arbitrary constant. Hence d and e are essential. 
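
The reduction from three constants to two can be spot-checked numerically. The following is only a sketch: the function names and the sample values of a, b, and c are assumptions introduced here for illustration, and agreement at sample points is evidence, not a proof.

```python
import math


def three_constant_form(a, b, c, x):
    """Expression (1): a cos^2 x + b sin^2 x + c cos 2x."""
    return a * math.cos(x) ** 2 + b * math.sin(x) ** 2 + c * math.cos(2 * x)


def two_constant_form(d, e, x):
    """Rearranged expression: d cos^2 x + e sin^2 x."""
    return d * math.cos(x) ** 2 + e * math.sin(x) ** 2


def max_difference(a, b, c, samples=200):
    """Largest gap between the two forms on [-pi, pi] with d = a + c, e = b - c."""
    d, e = a + c, b - c
    xs = [-math.pi + 2 * math.pi * i / (samples - 1) for i in range(samples)]
    return max(
        abs(three_constant_form(a, b, c, x) - two_constant_form(d, e, x))
        for x in xs
    )


if __name__ == "__main__":
    # Hypothetical sample constants; any real values should behave the same way.
    print(max_difference(1.5, -2.0, 0.75))
```

That d and e cannot be merged further is also easy to see by evaluation: at x = 0 the two-constant form equals d, and at x = π/2 it equals e, so distinct pairs (d, e) give distinct functions.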

It is frequently the case (especially with linear equations) 
that a differential equation of order n possesses solutions contain- 
ing n essential arbitrary constants but none containing more. 


* See, for instance, R. P. Agnew, “Differential Equations,” 2d ed., pp. 



ORDINARY DIFFERENTIAL EQUATIONS OF THE FIRST ORDER 


CHAP. 1 



However, there are equations such as 

$\left|\dfrac{dy}{dx}\right| + |y| = 0$ 

(which has only the single solution y = 0) and 

$\left|\dfrac{dy}{dx}\right| + 1 = 0$ 

(which has no solutions at all) which possess no solutions containing 
any arbitrary constants. Moreover, there are also simple 
differential equations which possess solutions containing more 
essential parameters than the order of the equation. For instance, 
it is easy to verify that the arc of the family $y = c_1x^2$ ($x \le 0$) 
(Fig. 1.1a) corresponding to any value of $c_1$ can be paired with the 
arc of the family $y = c_2x^2$ ($x \ge 0$) (Fig. 1.1b) corresponding to 
any value of $c_2$, to give a function which satisfies the differential 
equation 

(2)  $xy' = 2y$ 

for all values of x (Fig. 1.1c). A still more striking example of this 
sort appears in Exercise 30, where a first-order equation with a 
solution containing infinitely many essential parameters is given. 
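
The pieced-together solution of $xy' = 2y$ can be checked directly. The sketch below is illustrative only: the function names and the sample values of the two constants are assumptions, and the derivative is written out by hand for each arc (both arcs have slope 0 at x = 0, so the pieces join smoothly there).

```python
def pieced_parabola(c1, c2):
    """Return y(x) and y'(x) for y = c1*x^2 (x <= 0), y = c2*x^2 (x >= 0)."""

    def y(x):
        return c1 * x * x if x <= 0 else c2 * x * x

    def dy(x):
        return 2 * c1 * x if x <= 0 else 2 * c2 * x

    return y, dy


def max_residual(c1, c2, xs):
    """Largest value of |x*y' - 2*y| over the sample points xs."""
    y, dy = pieced_parabola(c1, c2)
    return max(abs(x * dy(x) - 2 * y(x)) for x in xs)


if __name__ == "__main__":
    sample_points = [i / 10 for i in range(-30, 31)]
    # Two independent, hypothetical constants for the left and right arcs.
    print(max_residual(-1.25, 3.0, sample_points))
```

Because the residual vanishes for every choice of the pair (c1, c2), a first-order equation here admits a two-parameter family of solutions valid for all x.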

Fig. 1.1 Arcs of different parabolas of the family $y = cx^2$ pieced together to give solutions of the 
differential equation $xy' = 2y$. 

As the foregoing suggests, it is difficult, if not impossible, to 
make statements valid for all differential equations. The theory 
of differential equations is essentially a body of theorems concerning 
particular classes of equations defined by such considerations 
as linearity, order, and continuity. Typical of these is the following 
result,* which is of fundamental importance in the study of 
the equations we shall consider in this chapter, namely, equations 
of the first order: 


THEOREM 1 

Let $(x_0, y_0)$ be a point of the xy-plane; let R be the rectangular region defined by 
the inequalities $|x - x_0| \le a$, $|y - y_0| \le b$; let $f(x,y)$ and $f_y(x,y) \equiv \partial f(x,y)/\partial y$ be 
single-valued and continuous at all points of R; let M be a constant such that 
$|f(x,y)| < M$ at all points of R; and let h be the smaller of the numbers a and 
b/M. Then, on the interval $|x - x_0| < h$, there is a unique continuous function y 
which satisfies the equation $y' = f(x,y)$ and which takes on the value $y_0$ when 
$x = x_0$. 

It is instructive to reconsider Eq. (2) in the light of Theorem 
1. For this equation we have $f(x,y) = 2y/x$, and, clearly, neither $f$ 
nor $f_y$ exists when x = 0. Hence, it follows from Theorem 1 that, 
over an interval containing x = 0, neither the existence nor the 
uniqueness of a solution of Eq. (2) can be guaranteed. Actually, 
as our earlier discussion pointed out, Eq. (2) does have solutions 
which are valid for all values of x. However, as Fig. 1.1c illustrates, 
over any interval which contains x = 0, the solution curve 
which passes through a given point $(x_0, y_0)$, e.g., (1,1), is not 
unique. On the other hand, according to Theorem 1, over any 
interval which contains $x_0$ but does not contain x = 0, the 
solution curve which passes through a given point $(x_0, y_0)$ is 
unique. 
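
To see how the quantities in Theorem 1 fit together for $f(x,y) = 2y/x$, the following sketch computes a bound M and the half-width h = min(a, b/M) for a rectangle that avoids x = 0. The function names, the small `margin` used to make the inequality $|f| < M$ strict, and the sample rectangle are all assumptions made for illustration.

```python
def strict_bound_for_2y_over_x(x0, y0, a, b, margin=1e-6):
    """Return a constant M with |2y/x| < M on |x - x0| <= a, |y - y0| <= b.

    The rectangle must not touch x = 0, where f is undefined.
    """
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    if abs(x0) <= a:
        raise ValueError("rectangle contains x = 0; Theorem 1 does not apply")
    largest_abs_y = abs(y0) + b   # largest |y| on the rectangle
    smallest_abs_x = abs(x0) - a  # smallest |x| on the rectangle
    return 2 * largest_abs_y / smallest_abs_x + margin


def interval_half_width(a, b, M):
    """h is the smaller of a and b/M, as in the statement of Theorem 1."""
    return min(a, b / M)


if __name__ == "__main__":
    # Hypothetical rectangle centered at (1, 1) with a = 0.5, b = 1.
    M = strict_bound_for_2y_over_x(1.0, 1.0, 0.5, 1.0)
    print(M, interval_half_width(0.5, 1.0, M))
```

The theorem only guarantees uniqueness on $|x - x_0| < h$; the solution through (1, 1) may well exist on a larger interval, and here it does.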

Almost all applications of differential equations involve 
equations which possess solutions containing at least one arbi- 
trary constant, and for such equations it is convenient to intro- 
duce the following definitions: A solution which contains at least 
one arbitrary constant is called a general solution. A solution 
obtained from a general solution by assigning particular values to 
the arbitrary constants which appear in it is called a particular 
solution. Solutions which cannot be obtained from any general 
solution by assigning specific values to its arbitrary constants are 
called singular solutions. If a general solution has the property 
that every solution of the differential equation can be obtained 
from it by assigning suitable values to its arbitrary constants, it 
is said to be a complete solution. A general solution can thus be 
thought of as a description of some family of particular solutions, 
and a complete solution can be thought of as a description of the 
set of all solutions of the given equation. 

It is important to note that we speak of a general solution 
and a complete solution of a differential equation and not of the 
general solution and the complete solution. If an equation has a 


* See, for instance, M. Golomb and M. E. Shanks, “Elements of Ordinary 
Differential Equations,” 2d ed., pp. 63-78, McGraw-Hill Book Company, 
New York, 1966. 




general solution or a complete solution, it has many such solu- 
tions, and these may differ significantly in form. Moreover, in 
particular problems involving differential equations, the choice 
of which complete solution to use often has an important bearing 
on the ease with which the problem can be solved. 

EXAMPLE 6 

Verify that $y = ae^{-x} + be^{2x}$ is a solution of the equation $y'' - y' - 2y = 0$ for all values of 
the constants a and b. 

By differentiating y, substituting into the differential equation as indicated, and then 
collecting terms on a and b, we obtain 

$(ae^{-x} + 4be^{2x}) - (-ae^{-x} + 2be^{2x}) - 2(ae^{-x} + be^{2x})$ 
$\quad = (e^{-x} + e^{-x} - 2e^{-x})a + (4e^{2x} - 2e^{2x} - 2e^{2x})b$ 
$\quad = 0 \cdot a + 0 \cdot b = 0$ 

for all values of a and b. Thus, $y = ae^{-x} + be^{2x}$ is a general solution of $y'' - y' - 2y = 0$. In 
fact, as we shall see in Sec. 2.2, it is a complete solution of this equation. 

It is interesting to note that, although $y_1 = ae^{-x}$ and $y_2 = be^{2x}$ also satisfy the equation 
$yy'' - (y')^2 = 0$ for all values of a and b, the sum 

$y = y_1 + y_2 = ae^{-x} + be^{2x}$ 

is not a solution of $yy'' - (y')^2 = 0$. In fact, differentiating, substituting, and simplifying, 
we have 

$(ae^{-x} + be^{2x})(ae^{-x} + 4be^{2x}) - (-ae^{-x} + 2be^{2x})^2 = 9abe^{x}$ 

and this cannot vanish identically unless either a or b is zero; that is, unless the sum y consists 
of just one or the other of the two individual solutions. Roughly speaking, the reason for this 
difference in behavior is that the equation $y'' - y' - 2y = 0$ is linear, whereas the equation 
$yy'' - (y')^2 = 0$ is nonlinear. More precisely, as we shall see in Theorem 1, Sec. 2.1, for linear 
equations in which y or one of its derivatives appears in every term, the sum of two solutions is 
also a solution, whereas, in general, the sum of two solutions of a nonlinear equation is not a 
solution. 
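
The two substitutions in this example can be repeated numerically. The sketch below is illustrative: the function names and the sample values of a, b, and x are assumptions, and the derivatives are entered by hand rather than computed symbolically.

```python
import math


def y_and_derivatives(a, b, x):
    """Return y, y', y'' for y = a*e^(-x) + b*e^(2x)."""
    em, e2 = math.exp(-x), math.exp(2 * x)
    y = a * em + b * e2
    dy = -a * em + 2 * b * e2
    d2y = a * em + 4 * b * e2
    return y, dy, d2y


def linear_residual(a, b, x):
    """Left side of the linear equation y'' - y' - 2y = 0."""
    y, dy, d2y = y_and_derivatives(a, b, x)
    return d2y - dy - 2 * y


def nonlinear_residual(a, b, x):
    """Left side of the nonlinear equation y*y'' - (y')^2 = 0."""
    y, dy, d2y = y_and_derivatives(a, b, x)
    return y * d2y - dy ** 2


if __name__ == "__main__":
    a, b, x = 2.0, -3.0, 0.7  # hypothetical sample values
    print(linear_residual(a, b, x))                                 # close to 0
    print(nonlinear_residual(a, b, x), 9 * a * b * math.exp(x))     # these agree
```

With either a = 0 or b = 0 the nonlinear residual drops to zero, which matches the remark that each term alone is a solution while their sum is not.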

Occasionally it is necessary to determine a differential equa- 
tion of order n which has a given function containing n arbitrary 
constants as a general solution. This can be done (at least theo- 
retically) by differentiating the given expression n times and then 
eliminating the arbitrary constants by algebraic manipulation of 
the resulting equations. 

EXAMPLE 7 

If a and b are arbitrary constants, find a second-order equation which has 

(3)  $y = ae^{x} + b\cos x$ 

as a general solution. 

By differentiating the given expression, we find 

(4)  $y' = ae^{x} - b\sin x$ 

(5)  $y'' = ae^{x} - b\cos x$ 

Then, by adding and subtracting Eqs. (3) and (5), we obtain 

$a = \dfrac{y + y''}{2e^{x}} \qquad b = \dfrac{y - y''}{2\cos x}$ 


SEC. 1.2 


FUNDAMENTAL DEFINITIONS 


Substitution of these into Eq. (4) gives 

$y' = \dfrac{y + y''}{2e^{x}}\,e^{x} - \dfrac{y - y''}{2\cos x}\,\sin x$ 

and finally 

(6)  $(1 + \tan x)y'' - 2y' + (1 - \tan x)y = 0$ 

Although Eq. (6), except for its obvious multiples, is the only second-order differential 
equation having (3) as a general solution, it is by no means the only equation of which (3) is a 
general solution. For instance, if (5) is differentiated twice more we obtain 

$y^{\mathrm{iv}} = ae^{x} + b\cos x$ 

and by comparing this with (3) we can see that the given function also satisfies the very simple 
equation 

(7)  $y^{\mathrm{iv}} = y$ 

Since Eq. (7) is of the fourth order, it presumably possesses general solutions containing 
four arbitrary constants, and it is easy to verify that 

$y = ae^{x} + b\cos x + ce^{-x} + d\sin x$ 

does in fact satisfy Eq. (7). 
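
A quick numerical check shows both sides of this discussion: the two-constant family satisfies the second-order equation $(1 + \tan x)y'' - 2y' + (1 - \tan x)y = 0$, while the extra term $ce^{-x}$, which is admitted by $y^{\mathrm{iv}} = y$, does not. The function names and sample values below are assumptions for illustration only.

```python
import math


def eq6_residual(y, dy, d2y, x):
    """Left side of (1 + tan x) y'' - 2 y' + (1 - tan x) y = 0."""
    t = math.tan(x)
    return (1 + t) * d2y - 2 * dy + (1 - t) * y


def family_residual(a, b, x):
    """Residual for y = a*e^x + b*cos x, with derivatives entered by hand."""
    ex = math.exp(x)
    y = a * ex + b * math.cos(x)
    dy = a * ex - b * math.sin(x)
    d2y = a * ex - b * math.cos(x)
    return eq6_residual(y, dy, d2y, x)


def extra_term_residual(c, x):
    """Residual for y = c*e^(-x); here y' = -y and y'' = y."""
    em = math.exp(-x)
    return eq6_residual(c * em, -c * em, c * em, x)


if __name__ == "__main__":
    print(family_residual(1.2, -0.7, 0.4))   # close to 0
    print(extra_term_residual(1.0, 0.4))     # equals 4*e^(-0.4), not 0
```

For $y = ce^{-x}$ the left side collapses to $4ce^{-x}$, so the second-order equation singles out exactly the two-parameter family (3), whereas the fourth-order equation has room for four constants.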


EXERCISES 

Describe each of the following equations, giving its order and telling whether it is ordinary or 
partial and linear or nonlinear: 


1 

3 

5 



■ 0 
0 


2  $y'' + (a + b\cos 2x)y = 0$ 
4  $y''' + 6y'' + 4y' + y = e^{x}$ 
&u = d s u 
1 dx 2 dx dt 



Verify that each of the following equations has the indicated solution for all values of a and b: 


 9  $y'' - 6y' + 9y = 0$;    $y = ae^{3x} + bxe^{3x}$ 

10  $y'' + 4y = 0$;    $y = a\cos 2x + b\sin 2x$ 

11  $(\cos 2x)y' + (2\sin 2x)y = 2$;    $y = a\cos 2x + \sin 2x$ 

12  $y'' + 2y' + 2y = 0$;    $y = e^{-x}(a\cos x + b\sin x)$ 

13  $2xy\,dy = (y^2 - x)\,dx$;    $y^2 = ax - x\ln|x|$ 

14  $(xy - x^2)\,dy = y^2\,dx$;    $y = ae^{y/x}$ 

15  $y'' + (y')^2 + 1 = 0$;    $y = \ln|\cos(x - a)| + b$ 

16  $\dfrac{\partial^2 u}{\partial x^2} = \dfrac{\partial u}{\partial t}$;    $u = ae^{-9t}\cos(3x + b)$ 

17  $4\dfrac{\partial^2 u}{\partial x^2} = \dfrac{\partial^2 u}{\partial t^2}$;    $u = af(x + 2t) + bg(x - 2t)$ 


If a and b are arbitrary constants, find a differential equation of minimum order of which each 
of the following expressions is a general solution: 


18  $y = ae^{-t} + be^{t}$ 

19  $y = ae^{-t} + be^{t} + ce^{2t}$ 

20  $y = ae^{-2t} + bte^{-2t}$ 

21  $y = 2ax + bx^2$ 

22  $y = e^{-x} + be^{2x}$ 

23  $y = a\cosh 2x + b\sinh 2x$ 

24  $y = \sin(ax + b)$ 




25 Find a differential equation which has as a general solution the expression which defines 
the family of all parabolas which touch the x-axis and have their axes vertical. 

26 Find a differential equation which has as a general solution the expression which defines 
the family of all lines which touch the parabola $2y = x^2$. 

27 Verify that, for all values of the arbitrary constants a and b, both $y_1 = ax^2$ and 
$y_2 = b(x - 1)^2$ satisfy each of the differential equations 

$(x^2 - x)y'' - (2x - 1)y' + 2y = 0$ and $2yy'' = (y')^2$ 

but that $y = ax^2 + b(x - 1)^2$ will satisfy only the first of these equations. Explain. 

28 Verify that, for all values of the arbitrary constants a, b, m, n, both $y_1 = ae^{mx}$ and $y_2 = be^{nx}$ 
satisfy the nonlinear differential equation $y'''y - y''y' = 0$. Under what conditions, if any, 
does the sum $y = y_1 + y_2 = ae^{mx} + be^{nx}$ also satisfy this equation? 

29 Verify that, for all values of the arbitrary constants $c_1$ and $c_2$, the differential equation 
$xy' - 2y + 2 = 0$ is satisfied by the function 

$y = \begin{cases} c_1x^2 + 1 & x \le 0 \\ c_2x^2 + 1 & x > 0 \end{cases}$ 

Explain. 

30 Verify that, for all values of the arbitrary constants $\{c_n\}$ ($n = \ldots, -1, 0, 1, 2, \ldots$), 
the differential equation 

$(1 - \cos x)y' - (\sin x)y = 0$ 

is satisfied by the function 

$y = c_n(1 - \cos x) \qquad 2n\pi \le x < 2(n + 1)\pi$ 

Explain. 


1.3 Separable First-order Equations 

In many cases a first-order differential equation can be reduced 
by algebraic manipulations to the form 
(1)  $f(x)\,dx = g(y)\,dy$ 

Such an equation is said to be separable, because the variables x 
and y can be separated from each other in such a way that x 
appears only in the coefficient of dx and y appears only in the 
coefficient of dy. An equation of this type can be solved at once 
by integration, and we have the general solution 

(2)  $\int f(x)\,dx = \int g(y)\,dy + c$ 

where c is an arbitrary constant of integration. It must be borne 
in mind, however, that the integrals which appear in (2) may be 
impossible to evaluate in terms of elementary functions, and 
numerical or graphical integration may be required before this 
solution can be put to practical use. 

Other forms which should be recognized as being separable 
are 

(3) f(x)G(y) dx = F(x)g(y) dy 

(4)  $\dfrac{dy}{dx} = M(x)N(y)$ 


SEC. 1.3 


separable first-order equations 


The general solution of Eq. (3) can be found by first dividing 
by the product F(x)G(y) to separate the variables and then 
integrating : 

$\int \frac{f(x)}{F(x)}\,dx = \int \frac{g(y)}{G(y)}\,dy + c$

Similarly, a general solution of Eq. (4) can be found by first 
multiplying by dx and dividing by N(y) and then integrating: 

$\int \frac{dy}{N(y)} = \int M(x)\,dx + c$

Clearly, the process of solving a separable equation will often 
involve division by one or more expressions. In such cases the 
results are valid where the divisors are not equal to zero, but may 
or may not be meaningful for values of the variables for which the 
division is impossible. Such values require special consideration, 
and, as we shall see in the next example, may lead us to singular 
solutions. 


EXAMPLE 1 


Solve the differential equation $dx + xy\,dy = y^2\,dx + y\,dy$.

It is not immediately evident that this equation is separable. In any case, however, the
best first step in solving an equation of this sort is to collect terms on dx and dy. This gives

$(1 - y^2)\,dx = y(1 - x)\,dy$

which is of the form (3). Hence, division by the product $(1 - x)(1 - y^2)$ will separate the
variables and reduce the equation to the standard form (1):

$\frac{dx}{1 - x} = \frac{y\,dy}{1 - y^2}$

Now, multiplying by $-2$ and integrating, we obtain the following equation defining y as an
implicit function of x:

$2\ln|1 - x| = \ln|1 - y^2| + c$

In this case, as in many problems of this sort, it is possible to write the solution in more
convenient form by first combining the logarithmic terms and then taking antilogs:

$\ln\left|\frac{(1 - x)^2}{1 - y^2}\right| = c$, $\left|\frac{(1 - x)^2}{1 - y^2}\right| = e^c = k^2$

where $k^2 = e^c$ is necessarily positive. Finally, clearing of fractions and eliminating the absolute
values, we have

$(1 - x)^2 = \pm k^2(1 - y^2)$, $k \ne 0$

The two possibilities here can, of course, be combined into one by writing

$(1 - x)^2 = \lambda(1 - y^2)$

where now $\lambda$ can take on any real value, positive or negative, except 0. The solution of the
differential equation thus defines the family of conics

(5) $\frac{(x - 1)^2}{\lambda} + y^2 = 1$, $\lambda \ne 0$

typical members of which are shown in Fig. 1.2. If $\lambda > 0$, the solution curves are all ellipses; if
$\lambda < 0$, the solution curves are all hyperbolas.




In most practical problems a general solution of a differential equation is required to satisfy
specific conditions which permit its arbitrary constants to be uniquely determined. For instance,
in the present problem we might ask for the particular solution curve which passes through the
point $(-\tfrac{1}{3}, 1\tfrac{2}{3})$. Substituting these values of x and y in (5), we then have

$\frac{16}{9} = \lambda\left(1 - \frac{25}{9}\right)$ or $\lambda = -1$

and the specific solution

(6) $y^2 = 1 + (x - 1)^2$

Equation (6) defines that unique member of the family of curves (5) which passes through
the point $(-\tfrac{1}{3}, 1\tfrac{2}{3})$. However, over any interval which contains x = 1, there are many
functions which satisfy the given differential equation and are such that $y = 1\tfrac{2}{3}$ when $x = -\tfrac{1}{3}$.

In fact, the upper branch of any curve of the family (5) for x > 1 can be associated with the
upper branch of the curve (6) for $x \le 1$ to give a function which satisfies the given equation
and fulfills the condition that $y = 1\tfrac{2}{3}$ when $x = -\tfrac{1}{3}$. This is, of course, consistent with the
fact that, according to Theorem 1, Sec. 1.2, the uniqueness of the solution for which $y = 1\tfrac{2}{3}$
when $x = -\tfrac{1}{3}$ can be guaranteed only over an interval around $x = -\tfrac{1}{3}$ which does not
contain x = 1, since y' is undefined at x = 1.

It should be noted that, in separating variables in the given equation, it was necessary to
divide by $1 - x$ and by $1 - y^2$; hence, the possibility that x = 1 and the possibility that $y = \pm 1$
were implicitly ruled out. Therefore, had we desired the particular solution curve which passed
through any point with coordinates of the form $(1, y_0)$, $(x_0, 1)$, or $(x_0, -1)$, we could not have
found that curve, if it existed at all, by using the general solution and particularizing $\lambda$. It would
have been necessary, instead, to return to the differential equation and search for the required
solution by some method other than separation of variables. In this case it is obvious that the
linear equations x = 1, y = 1, and y = -1 all define solutions of the given differential equation
and, moreover, satisfy, respectively, the conditions $(1, y_0)$, $(x_0, 1)$, and $(x_0, -1)$. None of
these can be obtained from our general solution, although x = 1 can be included in the first
form of it by permitting $\lambda$ to take on the (previously excluded) value zero. Hence y = 1 and
y = -1 appear as singular solutions of the given equation.
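
The conclusion of Example 1 can be spot-checked numerically. The following is only a minimal sketch, not part of the original text: it assumes the family (5) in the form $(x - 1)^2/\lambda + y^2 = 1$ and the slope $dy/dx = (1 - y^2)/[y(1 - x)]$ read off from the collected equation, and the function names and sample points are illustrative choices. It compares a central-difference slope of one ellipse and one hyperbola with the slope the differential equation demands.

```python
import math

def ode_slope(x, y):
    """dy/dx from (1 - y^2) dx = y(1 - x) dy; valid where y != 0 and x != 1."""
    return (1.0 - y * y) / (y * (1.0 - x))

def upper_branch(x, lam):
    """Upper branch (y >= 0) of the conic (x - 1)^2 / lam + y^2 = 1."""
    return math.sqrt(1.0 - (x - 1.0) ** 2 / lam)

def max_residual(lam, xs, h=1e-6):
    """Largest gap between the curve's numerical slope and the slope required by the equation."""
    worst = 0.0
    for x in xs:
        numeric = (upper_branch(x + h, lam) - upper_branch(x - h, lam)) / (2.0 * h)
        worst = max(worst, abs(numeric - ode_slope(x, upper_branch(x, lam))))
    return worst

# lam = 4 gives an ellipse (defined for -1 <= x <= 3); lam = -1 gives a hyperbola.
ellipse_residual = max_residual(4.0, [-0.5, 0.0, 0.5, 1.5, 2.5])
hyperbola_residual = max_residual(-1.0, [-2.0, -1.0, 0.5, 2.0, 3.0])
```

Both residuals should be at the level of rounding error. The sample points deliberately avoid x = 1 and y = 0, where the slope formula breaks down, in keeping with the discussion of excluded values above.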

EXERCISES 

Find a general solution of each of the following equations: 

1 $x\,dy = 3y\,dx$    2 $3x^2(1 + y^2)\,dx = dy$

3 $y\,dy = 2(xy + x)\,dx$    4 $y\,dx = 2(xy + x)\,dy$

5 $x\,dy = (y^2 - 3y + 2)\,dx$    6 $dx + y\,dy = x^2y\,dy$

7 $y^2\,dx - xy\,dy = xy(dy - y\,dx)$    8 $(xy^2 - x)\,dx = (y + x^2y)\,dy$

9 $ye^{x+y}\,dy = dx$    10 $yy'' = (y')^2$

Find that particular solution of each of the following equations which satisfies the indicated 
conditions: 

11 $dy = x(2y\,dx - x\,dy)$    x = 1, y = 4

12 $2x\,dx - dy = x(x\,dy - 2y\,dx)$    x = -3, y = 1

13 Is there a solution of the equation $x\,dy = 3(y - 1)\,dx$ with the property that y = 3 when
x = 1 and y = 9 when x = 2? Is there a solution of this equation with the property that
y = 3 when x = -1 and y = 9 when x = 2? Explain.

14 Find a solution of the equation $(1 - x^2)\,dy = -4xy\,dx$ with the property that y = 9 when
x = -2, y = 2 when x = 0, and y = 0 when x = 2.

15 Show that every solution of the equation $y' = ky$ is of the form $y = Ae^{kx}$. (Hint: Let y be
any solution of the given equation, and consider the derivative of the fraction $y/e^{kx}$.)

16 A critical student watching his professor integrate the separable equation $f(x)\,dx = g(y)\,dy$
objected that the procedure was incorrect, since one side was integrated with respect to x
while the other side was integrated with respect to y. How would you answer the student's
objection?

17 Show that the change of dependent variable defined by the substitution $v = ax + by + c$
will always transform the equation $y' = f(ax + by + c)$ into a separable equation.

Find a general solution of each of the following equations: 

18 y’ = (x - y)* 19 / - -2 + 

20 $y' = (x + y - 3)^2 - 2(x + y - 3)$


Homogeneous first-order equations 

If all terms in the coefficient functions M(x,y) and N(x,y) in the
general first-order differential equation

(1) $M(x,y)\,dx = N(x,y)\,dy$

are of the same degree in the variables x and y, then either of the
substitutions y = ux and x = vy will reduce the equation to one
which is separable.

More generally, if M(x,y) and N(x,y) have the property
that, for all positive values of $\lambda$, the substitution of $\lambda x$ for x and
$\lambda y$ for y converts them, respectively, into the expressions

$\lambda^n M(x,y)$ and $\lambda^n N(x,y)$

then Eq. (1) can always be reduced to a separable equation by
either of the substitutions y = ux and x = vy.

Functions with the property that the substitutions

$x \to \lambda x$ and $y \to \lambda y$, $\lambda > 0$

merely reproduce the original forms multiplied by $\lambda^n$ are called
homogeneous functions of degree n. As a direct extension of this
terminology, the differential equation (1) is said to be homogeneous
when M(x,y) and N(x,y) are homogeneous functions of
the same degree.


EXAMPLE 1

Is the function

$F(x,y) = x(\ln\sqrt{x^2 + y^2} - \ln y) + ye^{x/y}$

homogeneous?

To decide this question, we replace x by $\lambda x$ and y by $\lambda y$, getting

$F(\lambda x, \lambda y) = \lambda x(\ln\sqrt{\lambda^2x^2 + \lambda^2y^2} - \ln \lambda y) + \lambda ye^{\lambda x/\lambda y}$

$= \lambda x[(\ln\sqrt{x^2 + y^2} + \ln\lambda) - (\ln y + \ln\lambda)] + \lambda ye^{x/y}$

$= \lambda[x(\ln\sqrt{x^2 + y^2} - \ln y) + ye^{x/y}]$

$= \lambda F(x,y)$

The given function is, therefore, homogeneous of degree 1.
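
The scaling test used in this example lends itself to a quick numerical check. The sketch below is an illustration only: it assumes the function is $F(x,y) = x(\ln\sqrt{x^2 + y^2} - \ln y) + ye^{x/y}$ with y > 0, and the helper name `degree_estimate` and the sample points are hypothetical. If $F(\lambda x, \lambda y) = \lambda^n F(x,y)$, then n can be recovered as the ratio of two logarithms.

```python
import math

def F(x, y):
    """F(x, y) = x (ln sqrt(x^2 + y^2) - ln y) + y e^(x/y), defined for y > 0."""
    return x * (math.log(math.sqrt(x * x + y * y)) - math.log(y)) + y * math.exp(x / y)

def degree_estimate(f, x, y, lam=2.0):
    """If f(lam*x, lam*y) = lam**n * f(x, y) with f(x, y) > 0, return n."""
    return math.log(f(lam * x, lam * y) / f(x, y)) / math.log(lam)

# Sample points chosen (by assumption) so that F is positive there.
samples = [(1.0, 2.0), (0.5, 3.0), (-1.0, 1.5)]
degrees = [degree_estimate(F, x, y) for x, y in samples]
```

Each entry of `degrees` should agree with 1 to within rounding error. A numerical check of this kind can only suggest homogeneity at the points tried; the algebraic argument above is what establishes it.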

If Eq. (1), assumed now to be homogeneous, is written in the
form

$\frac{dy}{dx} = \frac{M(x,y)}{N(x,y)}$

it is evident that the fraction on the right is a homogeneous function
of degree zero, since the same power of $\lambda$ will multiply both
numerator and denominator when the test substitutions $x \to \lambda x$
and $y \to \lambda y$ are made. But if

$\frac{M(\lambda x, \lambda y)}{N(\lambda x, \lambda y)} = \frac{M(x,y)}{N(x,y)}$

it follows, assigning to the arbitrary symbol $\lambda$ the value 1/x if x is
positive and the value -1/x if x is negative, that

$\frac{M(x,y)}{N(x,y)} = \frac{M(\lambda x, \lambda y)}{N(\lambda x, \lambda y)} = \begin{cases} \dfrac{M(1, y/x)}{N(1, y/x)} & x > 0 \\ \dfrac{M(-1, -y/x)}{N(-1, -y/x)} & x < 0 \end{cases}$

In either case it is clear that the result is a function of the fractional
argument y/x. Thus, an alternative standard form for a
homogeneous first-order differential equation is

(2) $\frac{dy}{dx} = R\left(\frac{y}{x}\right)$




Although in practice it is not necessary to reduce a homogeneous
equation to the form (2) in order to solve it, the theory of the
substitution y = ux or u = y/x is most easily developed when
the equation is written in this form.

Now, if y = ux, then $dy/dx = u + x(du/dx)$. Hence, under
this substitution, Eq. (2) becomes

$u + x\frac{du}{dx} = R(u)$

or

(3) $x\,du = [R(u) - u]\,dx$

If $R(u) \equiv u$, Eq. (2) is simply

$\frac{dy}{dx} = \frac{y}{x}$

and this is separable at the outset. If $R(u) \not\equiv u$, we can divide (3)
by the product $x[R(u) - u]$, getting

$\frac{du}{R(u) - u} = \frac{dx}{x}$

The variables have now been separated, and the equation can be
integrated at once. Finally, by replacing u by its value y/x, we
can obtain the equation defining y as a function of x.

EXAMPLE 2 

Solve the equation $(x^2 + 3y^2)\,dx - 2xy\,dy = 0$.

By inspection, this equation is homogeneous, since all terms in the coefficient of each
differential are of the second degree. Hence, we substitute y = ux and $dy = u\,dx + x\,du$, getting

$(x^2 + 3u^2x^2)\,dx - 2x^2u(u\,dx + x\,du) = 0$

or, dividing by $x^2$ and collecting terms,

$(1 + u^2)\,dx - 2xu\,du = 0$

Separating variables, we obtain

$\frac{dx}{x} - \frac{2u\,du}{1 + u^2} = 0$

and then, by integrating, we find

$\ln|x| - \ln|1 + u^2| = c$

This can be written as

$\ln\left|\frac{x}{1 + u^2}\right| = \ln e^c = \ln k$ where $k = e^c > 0$

Hence, $|x/(1 + u^2)| = k$, or, replacing u by y/x and dropping absolute values,

$\frac{x}{1 + (y/x)^2} = \pm k$

Finally, clearing fractions, we have

$x^3 = K(x^2 + y^2)$





where, from the preceding steps, it appears that K can have any real value except zero. However,
it is easy to verify by direct substitution that the function corresponding to K = 0,
namely, x = 0, is also a solution of the given equation. Hence, in the general solution we have
just obtained, K is actually unrestricted.
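
The result of Example 2 can be checked by integrating the equation numerically and watching the quantity $x^3/(x^2 + y^2)$, which the general solution says must stay equal to K along any solution curve. The sketch below is illustrative only: the classical fourth-order Runge-Kutta routine `rk4`, the starting point (1, 1), and the step count are assumptions made for this check and are not taken from the text.

```python
def slope(x, y):
    """dy/dx from (x^2 + 3y^2) dx - 2xy dy = 0, valid where x*y != 0."""
    return (x * x + 3.0 * y * y) / (2.0 * x * y)

def rk4(f, x0, y0, x1, steps):
    """Classical fourth-order Runge-Kutta integration of y' = f(x, y) from x0 to x1."""
    h = (x1 - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2.0, y + h * k1 / 2.0)
        k3 = f(x + h / 2.0, y + h * k2 / 2.0)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    return x, y

def invariant(x, y):
    """K = x^3 / (x^2 + y^2), constant along each solution curve."""
    return x ** 3 / (x * x + y * y)

K_start = invariant(1.0, 1.0)                    # the curve through (1, 1)
x_end, y_end = rk4(slope, 1.0, 1.0, 2.0, 1000)   # follow that curve to x = 2
K_end = invariant(x_end, y_end)
```

If the general solution is right, `K_end` should agree with `K_start` to within the integration error, and `y_end` should satisfy $x^3 = K(x^2 + y^2)$ at x = 2.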


EXERCISES

1 Under what conditions, if any, do you think the substitution x = vy would be more
convenient than the substitution y = ux?

Find a general solution of each of the following equations: 

2 (x — 3 y) dx - (3z 2 y) dy 3 {—x + 3 y) dx ~ (x + ?/) dy 

4 $2x(dx + dy) + y(dy - 5\,dx) = 0$    5 $(x^2 + 2y^2)\,dx - xy\,dy = 0$

Find that particular solution of each of the following equations which satisfies the given 
conditions: 

6 xy dx m x*dy — p 2 dx x = 1, y = 1 

7 $(3y^3 - x^3)\,dx = 3xy^2\,dy$    x = 1, y = 2

8 $(x + y)^2\,dx = xy\,dy$    x = 1, y = 1

9 $y\,dy = (2x + y)\,dx$    x = 2, y = 1

10 xdy — y dx = + p 2 dx x = 4, y = 3 

11 $(x^3 + y^3)\,dx = 2xy^2\,dy$    x = 2, y = 1

13 $(x^4 + y^4)\,dx = 2x^3y\,dy$    x = 1, y = 0

14 $(y^2 + 2xy)\,dx + 2x^2\,dy = 0$    x = 1, y = -2

15 If $aB \ne bA$, show that, by choosing d and D suitably, the equation

$\frac{dy}{dx} = \frac{ax + by + c}{Ax + By + C}$

can be reduced to a homogeneous equation by the substitutions 


16 Discuss Exercise 15 in the case when aB = bA. (Hint: Recall Exercise 17, Sec. 1.3.) 

Find a general solution of each of the following equations: 

17 $y' = (x - y + 5)/(x + y - 1)$    18 $y' = (2x + 2y + 1)/(3x + y - 2)$

19 Give an example of a function which is homogeneous according to our definition but is not
homogeneous if $f(\lambda x, \lambda y) = \lambda^n f(x,y)$ is required to hold for all real values of $\lambda$.

20 If f(x,y) is a homogeneous function of degree n, show that

$x\frac{\partial f}{\partial x} + y\frac{\partial f}{\partial y} = nf$

What is the generalization of this result to functions of more than two variables? (This is
commonly referred to as Euler's theorem for homogeneous functions.)


Exact first-order equations 


Associated with each suitably differentiable function of two
variables f(x,y) there is an expression called its total differential,
namely,

$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy$

Conversely, if the differential equation

$M(x,y)\,dx + N(x,y)\,dy = 0$

has the property that

$M(x,y) = \frac{\partial f}{\partial x}$ and $N(x,y) = \frac{\partial f}{\partial y}$

then it can be rewritten in the form

$\frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = df = 0$

from which it follows that f(x,y) = k is a solution for all values of
the constant k. Equations of this sort are said to be exact, since,
as they stand, their left members are exact differentials.

When M(x,y) and N(x,y) are sufficiently simple, it is possible
to tell by inspection whether or not there exists a function f with
the property that

$\frac{\partial f}{\partial x} = M(x,y)$ and $\frac{\partial f}{\partial y} = N(x,y)$

In general, however, this cannot be done, and it is desirable to 
have a straightforward test to determine when a given first-order 
equation is exact. Such a criterion is provided by the following 
theorem: 


THEOREM 1 

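If $\frac{\partial M}{\partial y}$ and $\frac{\partial N}{\partial x}$ are continuous, then the differential equation

$M(x,y)\,dx + N(x,y)\,dy = 0$

is exact if and only if $\frac{\partial M}{\partial y} = \frac{\partial N}{\partial x}$.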

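PROOF To prove the theorem, let us assume first that the given equation is
exact. Under this assumption there exists a function f such that

$\frac{\partial f}{\partial x} = M$ and $\frac{\partial f}{\partial y} = N$

Hence,

$\frac{\partial M}{\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}$ and $\frac{\partial N}{\partial x} = \frac{\partial^2 f}{\partial x\,\partial y}$

Moreover, from the familiar properties of partial derivatives, we know that, under
our hypotheses,

$\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y}$

Hence, $\frac{\partial M}{\partial y} = \frac{\partial N}{\partial x}$, and the "only if" part of the theorem is established.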





To complete the proof we must now show that, if $\frac{\partial M}{\partial y} = \frac{\partial N}{\partial x}$, then there is a
function f such that $\frac{\partial f}{\partial x} = M$ and $\frac{\partial f}{\partial y} = N$. To do this, let us first integrate M(x,y)
with respect to x, holding y fixed. This gives us the expression

(1) $f(x,y) = \int_a^x M(x,y)\,dx + c(y)$, a arbitrary

in which the integration "constant" is actually a function of y to be determined.
Clearly, $\frac{\partial f}{\partial x} = M(x,y)$; and our proof will be complete if we can determine c(y) so
that $\frac{\partial f}{\partial y} = N(x,y)$.

Now, observing that, under our hypotheses, the operations of integrating
with respect to x and differentiating with respect to y can legitimately be
interchanged, and recalling our supposition that

$\frac{\partial M}{\partial y} = \frac{\partial N}{\partial x}$

we have, from (1),

$\frac{\partial f}{\partial y} = \frac{\partial}{\partial y}\int_a^x M(x,y)\,dx + c'(y)$

$= \int_a^x \frac{\partial M}{\partial y}\,dx + c'(y)$

$= \int_a^x \frac{\partial N}{\partial x}\,dx + c'(y)$

$= N(x,y) - N(a,y) + c'(y)$

Thus, $\frac{\partial f}{\partial y}$ will equal N(x,y), as required, if c(y) is determined so that

$c'(y) = N(a,y)$

that is, if

$c(y) = \int_b^y N(a,y)\,dy$, b arbitrary

We have thus shown that, if $\frac{\partial M}{\partial y} = \frac{\partial N}{\partial x}$, then

$f(x,y) = \int_a^x M(x,y)\,dx + \int_b^y N(a,y)\,dy$

is a function such that

$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = M(x,y)\,dx + N(x,y)\,dy$

This establishes the "if" assertion of the theorem, and our proof is complete.


COROLLARY 1 

If the differential equation $M(x,y)\,dx + N(x,y)\,dy = 0$ is exact, then, for all
values of k,

$\int_a^x M(x,y)\,dx + \int_b^y N(a,y)\,dy = k$

is a solution of the equation.





EXAMPLE 1 

Show that the equation $(2x + 3y - 2)\,dx + (3x - 4y + 1)\,dy = 0$ is exact, and find a general
solution.

Applying the test provided by Theorem 1, we find

$\frac{\partial M}{\partial y} = \frac{\partial(2x + 3y - 2)}{\partial y} = 3$ and $\frac{\partial N}{\partial x} = \frac{\partial(3x - 4y + 1)}{\partial x} = 3$

Since the two partial derivatives are equal, the equation is exact. Its solution can, therefore, be
found by means of Corollary 1, Theorem 1:

$\int_a^x (2x + 3y - 2)\,dx + \int_b^y (3a - 4y + 1)\,dy = k$

$(x^2 + 3xy - 2x)\Big|_a^x + (3ay - 2y^2 + y)\Big|_b^y = k$

$x^2 + 3xy - 2y^2 - 2x + y = k + a^2 + 3ab - 2b^2 - 2a + b = K$
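
The two steps of this example, the exactness test of Theorem 1 and the formula of Corollary 1, can be mirrored numerically. The following is a sketch under stated assumptions, not the author's procedure: the partial derivatives are estimated by central differences, the integrals are evaluated by Simpson's rule, and the lower limits are taken (arbitrarily) as a = b = 0. The names `simpson`, `potential`, and `closed_form` are illustrative.

```python
def M(x, y):
    return 2.0 * x + 3.0 * y - 2.0

def N(x, y):
    return 3.0 * x - 4.0 * y + 1.0

def simpson(g, lo, hi, n=200):
    """Composite Simpson's rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(lo + i * h)
    return total * h / 3.0

def potential(x, y, a=0.0, b=0.0):
    """f(x, y) = int_a^x M(s, y) ds + int_b^y N(a, t) dt, as in Corollary 1."""
    return simpson(lambda s: M(s, y), a, x) + simpson(lambda t: N(a, t), b, y)

def closed_form(x, y):
    """Left member of the general solution found in the example."""
    return x * x + 3.0 * x * y - 2.0 * y * y - 2.0 * x + y

h = 1e-5
x0, y0 = 1.5, -2.0
dM_dy = (M(x0, y0 + h) - M(x0, y0 - h)) / (2.0 * h)   # estimate of dM/dy
dN_dx = (N(x0 + h, y0) - N(x0 - h, y0)) / (2.0 * h)   # estimate of dN/dx
difference = potential(x0, y0) - closed_form(x0, y0)
```

With a = b = 0 the constant $a^2 + 3ab - 2b^2 - 2a + b$ vanishes, so `potential` and `closed_form` should agree at every point; a different choice of a and b would shift `potential` by a constant only, which is the content of Exercise 14 of this section.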

Occasionally an equation which is not exact can be made
exact by multiplying it by some simple expression. In fact, it can
be shown* that every first-order equation which possesses a
general solution can be made exact by multiplying it by a suitable
factor, called an integrating factor. In general, the determination
of an integrating factor for a given equation is very difficult.
However, as the following examples show, in particular cases an
integrating factor can often be found by inspection.


EXAMPLE 2 


Show that $1/(x^2 + y^2)$ is an integrating factor for the equation $(x^2 + y^2 - x)\,dx - y\,dy = 0$.
If the given equation is multiplied by the indicated factor, it can be rewritten in the form

$\left(1 - \frac{x}{x^2 + y^2}\right)dx - \frac{y}{x^2 + y^2}\,dy = 0$

The test provided by Theorem 1 can be used to show that this equation is exact, and Corollary 1,
Theorem 1, can be used to obtain the solution. However, it is simpler to observe that the last
equation can also be written

$dx - \frac{x\,dx + y\,dy}{x^2 + y^2} = 0$ or $dx - \frac{1}{2}\,\frac{d(x^2 + y^2)}{x^2 + y^2} = 0$

Hence, integrating,

$x - \ln\sqrt{x^2 + y^2} = k$


EXAMPLE 3 

Find an integrating factor for the equation $y\,dx + (x^2y^3 + x)\,dy = 0$, and solve the equation.
Since this equation can be rewritten in the form

$(y\,dx + x\,dy) + x^2y^3\,dy = 0$

and since $y\,dx + x\,dy = d(xy)$, it is natural to multiply the equation by $1/x^2y^2$, getting

$\frac{d(xy)}{x^2y^2} + y\,dy = 0$

This equation can now be integrated by inspection, and we have

$-\frac{1}{xy} + \frac{y^2}{2} = k$

* See, for instance, Golomb and Shanks, op. cit., pp. 52-53.

EXAMPLE 4 


Find an integrating factor for the equation $x\,dy - y\,dx = (4x^2 + y^2)\,dy$, and solve the equation.
In this equation, the terms on the left seem related to

$d\left(\frac{y}{x}\right) = \frac{x\,dy - y\,dx}{x^2}$ or, equally well, $d\left(\frac{x}{y}\right) = \frac{y\,dx - x\,dy}{y^2}$

If we pursue the first suggestion and multiply the equation by $1/x^2$, we obtain

$d\left(\frac{y}{x}\right) = \left(4 + \frac{y^2}{x^2}\right)dy$

This equation is still not exact, but it is separable, and division by $4 + y^2/x^2$ gives us

$\frac{d(y/x)}{4 + (y/x)^2} = dy$

Integrating this, we have finally

$\frac{1}{2}\tan^{-1}\frac{y}{2x} = y + k$

The results of the last three examples suggest the following 
observations, which are often helpful: 


a If a first-order differential equation contains the combination
$x\,dx + y\,dy$, try some function of $x^2 + y^2$ as an integrating
factor.

b If a first-order differential equation contains the combination
$y\,dx + x\,dy$, try some function of xy as an integrating factor.

c If a first-order differential equation contains the combination
$x\,dy - y\,dx$, try $1/x^2$ or $1/y^2$ as an integrating factor.
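
A trial integrating factor suggested by these observations can be tested numerically before any integration is attempted. The sketch below is illustrative and rests on assumptions: it takes the equations of Examples 2 and 3 as $(x^2 + y^2 - x)\,dx - y\,dy = 0$ and $y\,dx + (x^2y^3 + x)\,dy = 0$, estimates $\partial M/\partial y - \partial N/\partial x$ by central differences at one arbitrarily chosen point, and uses hypothetical helper names.

```python
def exactness_gap(M, N, x, y, h=1e-5):
    """Central-difference estimate of dM/dy - dN/dx at (x, y); zero for an exact equation."""
    dM_dy = (M(x, y + h) - M(x, y - h)) / (2.0 * h)
    dN_dx = (N(x + h, y) - N(x - h, y)) / (2.0 * h)
    return dM_dy - dN_dx

def scaled(mu, g):
    """Return the coefficient function mu(x, y) * g(x, y)."""
    return lambda x, y: mu(x, y) * g(x, y)

# Example 2: (x^2 + y^2 - x) dx - y dy = 0 with trial factor 1/(x^2 + y^2)
M2 = lambda x, y: x * x + y * y - x
N2 = lambda x, y: -y
mu2 = lambda x, y: 1.0 / (x * x + y * y)

# Example 3: y dx + (x^2 y^3 + x) dy = 0 with trial factor 1/(x^2 y^2)
M3 = lambda x, y: y
N3 = lambda x, y: x * x * y ** 3 + x
mu3 = lambda x, y: 1.0 / (x * x * y * y)

px, py = 1.3, 0.7   # arbitrary test point away from x = 0 and y = 0
gaps_before = [exactness_gap(M2, N2, px, py), exactness_gap(M3, N3, px, py)]
gaps_after = [
    exactness_gap(scaled(mu2, M2), scaled(mu2, N2), px, py),
    exactness_gap(scaled(mu3, M3), scaled(mu3, N3), px, py),
]
```

The entries of `gaps_before` should be clearly nonzero, while those of `gaps_after` should vanish to within the finite-difference error. A vanishing gap at a single point is only evidence, of course; the test of Theorem 1 must hold identically.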

EXERCISES 

Find a general solution of each of the following equations: 

1 $(3x^2 - 6xy)\,dx - (3x^2 + 2y)\,dy = 0$    2 $(y^2 - 1)\,dx + (2xy - \sin y)\,dy = 0$

3 $(2xy + x^2)\,dx + (x^2 + y^2)\,dy = 0$

4 $(x\sqrt{x^2 + y^2} + y)\,dx + (y\sqrt{x^2 + y^2} + x)\,dy = 0$

5 y(l + xy) dx ~ (x - 2 y) dy => 0 

6 3(2/0 + 1) dx -f 4a;y s dy = 0 7 (xy* - y) dx + x(xy - 1) dy « 0 

8 ydx - dy - x*y* dx + x dy 9 2 y dx + 3x dy « dx/xy 3 - dy/ y* 

Solve each of the following equations by two methods:

10 2V dx + (3 y ~ ^ dy " 0 11 (* + y)dx + (x - y)dy.**.Q 

12 $\sqrt{x^2 + y^2}\,dx = x\,dy - y\,dx$    13 $x\,dy + y\,dx = dx/y - dy/x$

14 Show that the arbitrary constants a and b which appear in the formula of Corollary 1, 
Theorem 1, add no generality to the solution. (Hint: Consider the partial derivatives with 
respect to a and b of the left-hand member of the formula.) 




15 If $\phi$ is an integrating factor of the equation $M(x,y)\,dx + N(x,y)\,dy = 0$, show that $\phi$
satisfies the partial differential equation



1.6 Linear first-order equations

By definition, a linear first-order differential equation cannot contain
products, powers, or other nonlinear combinations of y or y'.
Hence, its most general form is

$F(x)\frac{dy}{dx} + G(x)y = H(x)$

If we divide this equation by F(x) and rename the coefficients, it
appears in the more usual form

(1) $\frac{dy}{dx} + P(x)y = Q(x)$

The presence of two terms on the left side of (1) involving,
respectively, dy/dx and y suggests strongly that this expression is
in some way related to the derivative of a product, say $\phi(x)y$,
having y as one factor. Now the derivative of $\phi(x)y$ is

(2) $\phi(x)\frac{dy}{dx} + \frac{d\phi(x)}{dx}\,y$

and the left member of (1) can be made identically equal to this,
provided we first multiply Eq. (1) by $\phi(x)$, getting

(3) $\phi(x)\frac{dy}{dx} + \phi(x)P(x)y = \phi(x)Q(x)$

and then make the second terms in (2) and (3) equal by choosing
$\phi(x)$ such that

$\frac{d\phi(x)}{dx} = \phi(x)P(x)$

This is a simple separable equation, any nontrivial solution of
which will meet our requirements. Hence, we can write, in
particular,

$\frac{d\phi(x)}{\phi(x)} = P(x)\,dx$

$\ln|\phi(x)| = \int P(x)\,dx$

$\phi(x) = \exp\left[\int P(x)\,dx\right]$†

Thus, after Eq. (1) is multiplied by the factor

$\phi(x) = \exp\left[\int P(x)\,dx\right]$

† The notation exp [f(x)] is frequently used in place of $e^{f(x)}$, especially when
f(x) is a complicated expression.




increase in this amount during the infinitesimal interval of time dt. At any time t, the amount of
salt per gallon of solution is therefore Q/100 (lb/gal). Now the change dQ in the total amount of
salt in the tank is clearly the net gain in the interval dt due to the fresh brine running into and
the mixture running out of the tank. The rate at which salt enters the tank is

5 (gal/min) × 2 (lb/gal) = 10 (lb/min)

Hence, in the interval dt the gain in salt from this source is

10 (lb/min) × dt (min) = 10 dt (lb)

Likewise, since the concentration of salt in the mixture as it leaves the tank is the same as the
concentration Q/100 in the tank itself, the amount of salt leaving the tank in the interval dt is

5 (gal/min) × $\frac{Q}{100}$ (lb/gal) × dt (min) = $\frac{Q}{20}\,dt$ (lb)

Therefore, $dQ = \left(10 - \frac{Q}{20}\right)dt$

This equation can be written in the form

(1) $\frac{dQ}{200 - Q} = \frac{dt}{20}$

and handled as a separable equation, or it can be written

(2) $\frac{dQ}{dt} + \frac{Q}{20} = 10$

and treated as a linear equation.

Considering it as a linear equation, we must first compute the integrating factor

$e^{\int P\,dt} = e^{\int dt/20} = e^{t/20}$

Multiplying Eq. (2) by this factor gives

$e^{t/20}\left(\frac{dQ}{dt} + \frac{Q}{20}\right) = 10e^{t/20}$

From this, by integration, we obtain

$Qe^{t/20} = 200e^{t/20} + k$ or $Q = 200 + ke^{-t/20}$

Substituting the initial conditions t = 0, Q = 100, we find

100 = 200 + k or k = -100

Hence, $Q = 200 - 100e^{-t/20}$

To find how long it will be before there is 150 lb of salt in the tank, we must find the value
of t such that

$150 = 200 - 100e^{-t/20}$ or $e^{-t/20} = \frac{1}{2}$

From this we have at once

$-\frac{t}{20} = \ln\frac{1}{2} = -\ln 2 = -0.693$ and t = 13.9 min
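
The closed-form answer of this example can be cross-checked against a direct step-by-step integration of $dQ/dt = 10 - Q/20$ with Q = 100 at t = 0. The sketch below is only an illustration: Euler's method, the step count, and the function names are choices made here and are not part of the worked solution.

```python
import math

def salt(t):
    """Closed-form amount of salt (lb) at time t (min): Q = 200 - 100 e^(-t/20)."""
    return 200.0 - 100.0 * math.exp(-t / 20.0)

def euler_salt(t_end, steps):
    """Euler integration of dQ/dt = 10 - Q/20 starting from Q(0) = 100."""
    dt = t_end / steps
    q = 100.0
    for _ in range(steps):
        q += (10.0 - q / 20.0) * dt   # salt in minus salt out over one step
    return q

t_150 = 20.0 * math.log(2.0)          # time at which Q reaches 150 lb
q_closed = salt(t_150)
q_euler = euler_salt(t_150, 100000)
```

Both `q_closed` and `q_euler` should come out close to 150 lb, and `t_150` reproduces the 13.9 min found above. As t grows, `salt(t)` approaches 200 lb, the amount corresponding to the inflow concentration of 2 lb/gal in 100 gal.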

EXAMPLE 2 

A hemispherical tank of radius R is initially filled with water. At the bottom of the tank there 
is a hole of radius r through which the water drains under the influence of gravity. Find the 
depth of the water at any time t, and determine how long it will take the tank to drain completely. 




Let the origin be chosen at the lowest point of the tank, let y be the instantaneous depth of
the water, and let x be the instantaneous radius of the free surface of the water (Fig. 1.3).
Then in the infinitesimal interval dt the water level will fall by the amount dy, and the resultant
decrease in the volume of water in the tank will be

$dV = \pi x^2\,dy$

This, of course, must equal the volume of water that leaves the orifice during the time dt. Now
from Torricelli's law,* the velocity with which a liquid issues from an orifice is

$v = \sqrt{2gh}$

where g is the acceleration of gravity and h is the instantaneous height, or head, of the liquid
above the orifice; here h = y. In the interval dt, then, a stream of water of length $\sqrt{2gy}\,dt$ and of
cross-section area $\pi r^2$† will emerge from the outlet. The volume of this amount of water is

$dV = \pi r^2\sqrt{2gy}\,dt$

Hence, equating the two expressions for dV, we obtain the differential equation

(3) $\pi x^2\,dy = -\pi r^2\sqrt{2gy}\,dt$

the minus sign indicating that as t increases, the depth y decreases.

Before this equation can be solved, it is necessary that x be expressed in terms of y. This is
easily done through the use of the equation of the circle which describes the vertical cross section
of the tank:

$x^2 + (y - R)^2 = R^2$ or $x^2 = 2yR - y^2$

Using this, the differential equation (3) can be written

$\pi(2yR - y^2)\,dy = -\pi r^2\sqrt{2gy}\,dt$

This is a simple separable equation which can be solved without difficulty:

$(2Ry^{1/2} - y^{3/2})\,dy = -r^2\sqrt{2g}\,dt$

$\frac{4}{3}Ry^{3/2} - \frac{2}{5}y^{5/2} = -r^2\sqrt{2g}\,t + c$


FIGURE 1.3 A vertical plane section through the center of a hemispherical tank.



* Named for the Italian mathematician and physicist Evangelista Torricelli 

(1608-1647). 

† This neglects the fact that the stream contracts near the orifice. How
much the cross section of the stream decreases depends in a very complicated 
way upon the size and shape of both the tank and the orifice and also upon 
the head. However, in most practical problems reasonably accurate answers 
can be obtained by assuming that the cross section of the stream just after 
it leaves the orifice is 0.6 times the area of the orifice. 




Since y = R when t = 0, we find

$\frac{14}{15}R^{5/2} = c$

and thus $\frac{4}{3}Ry^{3/2} - \frac{2}{5}y^{5/2} = -r^2\sqrt{2g}\,t + \frac{14}{15}R^{5/2}$

This is the equation which expresses the instantaneous depth y as a function of t.

To find how long it will take the tank to empty, we must determine the value of t
corresponding to y = 0:

$0 = -r^2\sqrt{2g}\,t + \frac{14}{15}R^{5/2}$

$t = \frac{14}{15}\,\frac{R^{5/2}}{r^2\sqrt{2g}}$
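
The relation between depth and time obtained in this example is easily put in computable form. The sketch below is illustrative only: the value g = 32.2 ft/sec², the tank radius of 4 ft, and the orifice radius of 1 in. are assumed numbers chosen for the check, and the function names are hypothetical. It solves the implicit relation for t and confirms, by a finite difference, that the result is consistent with Eq. (3).

```python
import math

def time_to_depth(y, R, r, g=32.2):
    """Time at which the depth has fallen to y, from
    (4/3) R y^(3/2) - (2/5) y^(5/2) = -r^2 sqrt(2g) t + (14/15) R^(5/2)."""
    numerator = (14.0 / 15.0) * R ** 2.5 - (4.0 / 3.0) * R * y ** 1.5 + (2.0 / 5.0) * y ** 2.5
    return numerator / (r * r * math.sqrt(2.0 * g))

def drain_time(R, r, g=32.2):
    """Time for the tank to empty: t = (14/15) R^(5/2) / (r^2 sqrt(2g))."""
    return 14.0 * R ** 2.5 / (15.0 * r * r * math.sqrt(2.0 * g))

R, r = 4.0, 1.0 / 12.0   # assumed sizes in feet: 4-ft tank radius, 1-in. orifice radius
y, h = 2.0, 1e-5
dt_dy_numeric = (time_to_depth(y + h, R, r) - time_to_depth(y - h, R, r)) / (2.0 * h)
# Eq. (3) with x^2 = 2yR - y^2 gives dt/dy = -(2yR - y^2) / (r^2 sqrt(2 g y)).
dt_dy_from_ode = -(2.0 * y * R - y * y) / (r * r * math.sqrt(2.0 * 32.2 * y))
```

Here `time_to_depth(R, R, r)` is zero, `time_to_depth(0.0, R, r)` equals `drain_time(R, r)`, and the two derivative values should agree closely. The footnote on the contraction of the stream suggests that in practice the effective orifice area would be reduced, which would lengthen the computed time accordingly.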

EXAMPLE 3 

The rate at which a solid substance dissolves varies directly as the amount of undissolved solid
present in the solvent and as the difference between the instantaneous concentration and the
saturation concentration of the substance. Twenty pounds of solute is dumped into a tank
containing 120 lb of solvent, and at the end of 12 min the concentration is observed to be 1 part in
30. Find the amount of solute in solution at any time t if the saturation concentration is 1 part
of solute to 3 parts of solvent.

If Q is the amount of the material in solution at time t, then 20 - Q is the amount of
undissolved material at that time and Q/120 is the corresponding concentration. Hence, according to
the given law,

$\frac{dQ}{dt} = k(20 - Q)\left(\frac{1}{3} - \frac{Q}{120}\right) = \frac{k}{120}(20 - Q)(40 - Q)$

This is a simple separable equation, and we have at once

$\frac{dQ}{(20 - Q)(40 - Q)} = \frac{k}{120}\,dt$

To integrate the left member it is convenient to use the method of partial fractions and
write

$\frac{1}{(20 - Q)(40 - Q)} = \frac{A}{20 - Q} + \frac{B}{40 - Q} = \frac{A(40 - Q) + B(20 - Q)}{(20 - Q)(40 - Q)}$

This will be an identity if and only if

$1 = A(40 - Q) + B(20 - Q)$

Setting Q = 20 and Q = 40, in turn, we find from this that

$A = \frac{1}{20}$ and $B = -\frac{1}{20}$

Hence the differential equation can be written

$\frac{1}{20}\left(\frac{1}{20 - Q} - \frac{1}{40 - Q}\right)dQ = \frac{k}{120}\,dt$

and, integrating, we have

(4) $-\ln(20 - Q) + \ln(40 - Q) = \frac{k}{6}\,t + c$

When t = 0, the amount $Q = Q_0$ of dissolved material is zero. Hence

$-\ln 20 + \ln 40 = c$ or $c = \ln 2$





and Eq. (4) can be written

(5) $\ln\frac{40 - Q}{2(20 - Q)} = \frac{k}{6}\,t$

To find k we use the fact that when t = 12, the concentration Q/120 is 1/30, or Q = 4. Hence,
substituting these values,

$\ln\frac{36}{32} = 2k$ or $k = \frac{1}{2}\ln\frac{9}{8} = 0.05889$

Passing to exponential form from Eq. (5), in order to solve for Q, we have

$\frac{40 - Q}{40 - 2Q} = e^{0.00982t}$

and finally

$Q = \frac{40 - 40e^{0.00982t}}{1 - 2e^{0.00982t}} = \frac{40(1 - e^{-0.00982t})}{2 - e^{-0.00982t}}$
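
The constants in this example are easy to recompute, and the final formula can be tested against both the data and the differential equation. The sketch below is an illustration under stated assumptions: it takes the law in the form $dQ/dt = (k/120)(20 - Q)(40 - Q)$ with Q = 0 at t = 0 and Q = 4 at t = 12, and the function name `dissolved` and the test time t = 30 are arbitrary choices.

```python
import math

k = 0.5 * math.log(36.0 / 32.0)       # from ln(36/32) = 2k

def dissolved(t):
    """Amount in solution (lb) at time t (min): Q = 40 (1 - e^(-kt/6)) / (2 - e^(-kt/6))."""
    e = math.exp(-k * t / 6.0)
    return 40.0 * (1.0 - e) / (2.0 - e)

# Compare a central-difference derivative with the right member of the rate law at t = 30.
t0, h = 30.0, 1e-4
rate_numeric = (dissolved(t0 + h) - dissolved(t0 - h)) / (2.0 * h)
q0 = dissolved(t0)
rate_from_law = (k / 120.0) * (20.0 - q0) * (40.0 - q0)
```

The value of `k` should reproduce 0.05889, `dissolved(12.0)` should return the observed 4 lb, and the two rates should agree. For large t the formula tends to 20 lb, that is, all the solute eventually dissolves, since 20 lb is below the 40 lb that would saturate 120 lb of solvent.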


EXERCISES 

1 Under certain conditions it is observed that the rate at which atmospheric pressure changes 
with altitude is proportional to the pressure. If the pressure is 14.7 lb/in. s at sea level and 
if it has fallen to one-half this value at 18,000 ft, find the formula for the pressure at any 
height. 

2 Although water is often assumed to be incompressible, it actually is not. In fact, using 
pounds and feet as units, the weight of a cubic foot of water under pressure p is approxi- 
mately to(l + kp) where w — 64, k = 2 X 10~ 8 , and p is measured from standard atmos- 
pheric pressure as an origin. Using this information, find the pressure at any depth y below 
the surface of the ocean. At a depth of 6 miles, by what factor does the actual pressure 
exceed the pressure computed on the assumption that water is incompressible? 

3 Radium disintegrates at a rate proportional to the amount of radium instantaneously 
present. If one-half of any given amount of radium will disappear in 1,590 years, what 
fraction will disintegrate during the first century? during the tenth century? 

4 According to Lambert’s law of absorption,* when light passes through a transparent medium,
the amount absorbed by any thin layer of the material is proportional to the amount
incident on that layer and to the thickness of the layer. In his deep-sea explorations off
Bermuda, Beebe observed that at a depth of 50 ft the intensity of illumination was 10 candles/ft$^2$,
and that at 250 ft it had fallen to 0.2 candle/ft$^2$. Find the law connecting intensity
with depth in this case.

5 It is a fact of common experience that, when a rope is wound around a rough cylinder, a
small force at one end can resist a much larger force at the other. Quantitatively, it is 
found that, throughout the portion of the rope in contact with the cylinder, the change in 
tension per unit length is proportional to the tension, the proportionality constant being 
the coefficient of friction between the rope and the cylinder divided by the radius of the 
cylinder. Assuming a coefficient of friction of 0.35, how many times must a rope be snubbed 
around a post 1 ft in diameter in order that a man holding one end can resist a force 200 
times greater than he can exert? 

6 When ethyl acetate in dilute aqueous solution is heated in the presence of a small amount of 
acid, it decomposes according to the following equation: 

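$$\underset{\text{ethyl acetate}}{\mathrm{CH_3COOC_2H_5}} + \underset{\text{water}}{\mathrm{H_2O}} \rightarrow \underset{\text{acetic acid}}{\mathrm{CH_3COOH}} + \underset{\text{ethyl alcohol}}{\mathrm{C_2H_5OH}}$$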


* Named for the German mathematician and astronomer Johann Heinrich 
Lambert (1728-1777). 




Since this reaction takes place in dilute solution, the quantity of water present is so great 
that the loss of the small amount which combines with the ethyl acetate produces no appre- 
ciable change in the total amount. Hence, of the reacting substances only the ethyl acetate 
suffers a measurable change in concentration. A chemical reaction of this sort, in which the 
concentration of only one reacting substance changes, is called a first-order reaction. It is 
a law of physical chemistry that the rate at which a substance is being used up, i.e., transformed,
in a first-order reaction is proportional to the amount of that substance instantaneously
present. If the initial concentration of ethyl acetate is $C_0$, find the expression for
its instantaneous concentration at any time $t$.

7 In some chemical reactions where two substances combine to form a third, the amount of 
each of the reacting substances changes appreciably. In such cases it is observed that the 
rate at which the resulting compound is formed is proportional to the product of the untransformed
amounts of the two reacting substances. If two substances combine in the
ratio 1:2, by weight, to form a third substance, and if it is observed that, 10 min after 
10 grams of the first substance and 20 grams of the second are mixed, the amount of the 
product which has been formed is 5 grams, find an expression for the amount of the product 
present at any time. 

8 Work Exercise 7 given that, instead of 10 grams of the first substance and 20 grams of the 
second, 20 grams of each substance are mixed. 

9 A mothball loses mass by evaporation at a rate proportional to its instantaneous surface 
area. If half its mass is lost in 100 days, how long will it take its radius to decrease to half 
its initial value? How long will it be before the mothball disappears completely?

10 When a volatile substance is placed in a sealed container, molecules leave its surface at a 
rate proportional to the area of the surface and return at a rate proportional to the amount 
which has evaporated. If a volatile material is spread evenly to a depth h over the bottom 
of a closed box, find the depth of the material at any time. Under what conditions, if any, 
will all the material eventually evaporate? 

11 A rapidly rotating flywheel, after power is shut off, “coasts” to rest under the retarding
influence of a friction torque which is proportional to the instantaneous angular velocity $\omega$.
If the moment of inertia of the flywheel is $I$ and if its initial velocity is $\omega_0$, find its instantaneous
angular velocity as a function of time. How long will it take the flywheel to come
to rest? (Hint: Use Newton’s law in torsional form,

Moment of inertia × angular acceleration = torque


to set up the differential equation describing the motion.) 

12 The friction torque acting to slow down a flywheel is actually not proportional to the first 
power of the angular velocity at all speeds. As a more realistic example than Exercise 11, 
suppose that a flywheel of moment of inertia $I = 7.5$ lb-ft sec$^2$ coasts to rest from an initial
speed of 1,000 rad/min under the influence of a retarding torque $T$ estimated to be the
following:



$0 < \omega < 100$ rad/min
$100 < \omega < 1{,}000$ rad/sec


Find $\omega$ as a function of $t$, and determine how long it will take the flywheel to come to rest.

13 A body weighing w lb falls from rest under the influence of gravity and a retarding force 
due to air resistance, assumed to be proportional to the velocity. Find the equations expres- 
sing the velocity of fall and the distance fallen, as functions of t, and verify that these 
reduce to the ideal laws 


$$v = gt \quad\text{and}\quad s = \tfrac{1}{2}gt^2$$




when the coefficient of air resistance approaches zero. (Hint: Use Newton’s law, 

Mass X acceleration = force 

to set up the differential equation which describes the motion.) 

14 Work Exercise 13, given that the retarding force due to air resistance is proportional to the 
square of the velocity of fall. 

15 A body falls from rest from a height so great that the fact that the force of gravity varies
inversely as the square of the distance from the center of the earth cannot be neglected.
Find the equations expressing the velocity of fall and the distance fallen as functions of $t$ in
the ideal case in which air resistance is neglected. [Hint: $dv/dt = (dv/dy)(dy/dt) = v\,(dv/dy)$.]

16 Under the conditions of Exercise 15, determine the minimum initial velocity with which a 
body must be projected upward if it is to leave the earth and never return. 

17 A particle of mass $m$ moves along the $x$-axis under the influence of a force which is directed
toward the origin and proportional to the distance of the particle from the origin. If the
body starts from rest at the point where $x = x_0$, find the equations that express its velocity
and its distance from the origin as functions of $t$. (Note the hint to Exercise 15.)

18 A tank contains 100 gal of brine in which 50 lb of salt is dissolved. Brine containing 2 lb/gal
of salt runs into the tank at the rate of 3 gal/min, and the mixture, assumed to be kept
uniform by stirring, runs out of the tank at the rate of 2 gal/min. Assuming the tank
sufficiently large to avoid overflow, find the amount of salt in the tank as a function of $t$.

19 Work Exercise 18 with the rates of influx and efflux interchanged. 

20 Work Example 3 given that the saturation concentration is 1 part of solute to 12 parts of 
solvent. 

21 Work Example 3 given that the saturation concentration is 1 part of solute to 6 parts of 
solvent. 

22 Work Example 3, with concentration defined as the ratio of solute to solution instead of 
solute to solvent. 

23 According to Newton’s law of cooling, the rate at which the temperature of a body decreases 
is proportional to the difference between the instantaneous temperature of the body and the 
temperature of the surrounding medium. If a body whose temperature is initially 100°C 
is allowed to cool in air which remains at the constant temperature 20°C, and if it is ob- 
served that in 10 min the body has cooled to 60°, find the temperature of the body as a 
function of time. 

24 A tank and its contents weigh 100 lb. The average heat capacity of the system is 0.5 Btu/(lb)(°F).
The liquid in the tank is heated by an immersion heater which delivers 100
Btu/min. Heat is lost from the system at a rate proportional to the difference between the
temperature of the system, assumed constant throughout at any instant, and the temperature
of the surrounding air, the proportionality constant being 2 Btu/(min)(°F). If the air
temperature remains constant at 70° and if the initial temperature of the tank and its 
contents is 55°, find the temperature of the tank at any time. 

25 According to Fourier’s law of heat conduction, the amount of heat in Btu per unit time
flowing through an area is proportional to the area and to the temperature gradient, in
degrees per unit length, normal to the area. On the basis of this law, obtain a formula for
the amount of heat lost per unit time from 1 ft$^2$ of furnace wall $h$ ft thick, if the temperature
in the furnace is $T_0$ and if the air temperature outside the furnace is $T_1$. What is the temperature
distribution through the furnace wall?

26 Using Fourier’s law of heat conduction, obtain a formula for the amount of heat lost per unit
time from l ft of pipe of radius $r_0$ carrying steam at temperature $T_0$ if the pipe is covered
with $w$ in. of insulation, the outer surface of which remains at the constant temperature $T_1$.
What is the temperature distribution through the insulation?

27 The inner and outer surfaces of a hollow sphere are maintained at the respective temperatures
$T_0$ and $T_1$. If the inner and outer radii of the spherical shell are $r_0$ and $r_1$, find the




amount of heat lost from the sphere per unit time. What is the temperature distribution 
through the shell? 

28 When a condenser of capacity C is being charged through a resistance R by a battery which 
supplies a constant voltage E, the instantaneous charge Q on the condenser satisfies the 
differential equation 


$$R\,\frac{dQ}{dt} + \frac{Q}{C} = E$$


Find $Q$ as a function of $t$ if the condenser is initially uncharged. How long will it be before
the condenser is half charged? 

29 When a switch is closed in a circuit containing a resistance R, an inductance L, and a 
battery which supplies a constant voltage E, the current i builds up at a rate defined by 
the relation 

$$L\,\frac{di}{dt} + Ri = E$$


Find i as a function of t. How long will it take i to reach one-half of its final value? 

30 In Exercise 28, find $Q$ as a function of $t$ if the battery is replaced by a generator which
supplies an alternating voltage equal to $E_0 \sin \omega t$.

31 In Exercise 29, find $i$ as a function of $t$ if the battery is replaced by a generator which supplies
an alternating voltage equal to $E_0 \cos \omega t$. What is the phase difference between the
impressed voltage and the resultant current after the current has been flowing a long time? 

32 A vertical cylindrical tank of radius r is filled with liquid to a depth h. When the tank is 
rotated about its axis, centrifugal force tends to drive the liquid outward from the center 
of the tank. Under steady conditions of rotation with constant angular velocity $\omega$, find the
equation of the curve in which the free surface of the liquid is intersected by a plane through 
the axis of the cylinder, assuming the tank to be sufficiently deep that no liquid is spilled 
over the edge. 

33 A weight $W$ is to be supported by a column having the shape of a solid of revolution. If
the material of the column weighs $\rho$ lb/ft$^3$, and if the radius of the upper base of the column
is to be $r_0$, determine how the radius of the column should vary in order that at all cross
sections the load per unit area will be the same. 

34 Work Example 2 if the tank has the shape of an inverted right circular cone of radius $R$
and height h. 

35 Work Example 2 if the tank has the shape of a vertical right circular cylinder of radius $R$
and height h and if, in addition to a hole of radius r in the bottom, there is also a hole of 
radius r in the side at a distance of h/2 above the base. 

36 A cylindrical tank is l ft long and has semicircular end sections of radius r ft. The tank is 
placed with its axis horizontal and is initially filled with water. How long will it take the 
tank to drain through a hole of area $a$ ft$^2$ in the bottom of the tank?

37 A vertical cylindrical tank of radius r and height h has a narrow crack of width w running 
vertically from top to bottom. If the tank is initially filled with water and allowed to drain 
through the crack under the influence of gravity, find the instantaneous depth of the water 
in the tank as a function of t. How long will it take the tank to empty? (Hint : First imagine 
the crack to be a series of adjacent orifices, and integrate to find the total efflux from the 
crack in the infinitesimal interval dt.) 

38 Water flows into a vertical cylindrical tank of cross-section area $A$ ft$^2$ at the rate of $Q$
ft$^3$/min. At the same time the water flows out under the influence of gravity through a hole
of area $a$ ft$^2$ in the base of the tank. If the water is initially $h$ ft deep, find the instantaneous
depth as a function of t. 

39 If two families of curves have the property that each member of either family cuts every 
member of the other family at right angles, the curves of either family are said to be 





orthogonal trajectories of the curves of the other family. Find the orthogonal trajectories of 
the curves of the family $2x^2 + y^2 = kx$. [Hint: Show that, at a general point $(x, y)$, the slope
of that curve of the given family which passes through that point is given by the formula

$$y' = \frac{-2x^2 + y^2}{2xy}$$

and then find the curves whose slopes are given by the negative reciprocal of this expression.]
40 Find the orthogonal trajectories of the curves of the family 



CHAPTER TWO 


Linear Differential Equations with Constant Coefficients


2.1 The general linear second-order equation

The general linear differential equation of the second order can be 
written in the standard form 

(1)  $y'' + P(x)y' + Q(x)y = R(x)$

where P, Q, and R are known functions. Clearly, no loss of 
generality results from taking the coefficient of y" to be unity, 
since this can always be accomplished by division. Because of the 
presence of the term R(x), which is unlike the other terms in that 
it does not contain the dependent variable y or any of its deriva- 
tives, Eq. (1) is said to be nonhomogeneous. If R(x) is identically 
zero, we have the so-called homogeneous equation* 

(2)  $y'' + P(x)y' + Q(x)y = 0$

In general, neither Eq. (1) nor Eq. (2) can be solved in terms 
of known functions. The theory associated with such special cases 
as have been studied at length is, for the most part, very difficult. 
At this stage we shall consider in detail only the simple, though 
highly important, case in which P(x) and Q(x) are constants. 
However, both as an illustration of how certain properties of the 
solutions of a differential equation can be established even though 
the form of those solutions is unknown and also because we shall 
have need of the results themselves, we shall begin by proving 
three fundamental theorems pertaining to the solutions of the 
general equations (1) and (2). 


* It is regrettable that in describing linear equations of all orders the word 
homogeneous should be used in a manner totally unlike its use in describing 
equations of the first order (Sec. 1.4). The usage is universal, however, and 
must be accepted. 





THEOREM 1 

If $y_1$ and $y_2$ are any solutions of the homogeneous equation

$$y'' + P(x)y' + Q(x)y = 0$$

then $y_3 = c_1y_1 + c_2y_2$, where $c_1$ and $c_2$ are arbitrary constants, is also a solution.

PROOF To establish this theorem, it is necessary only to substitute the
expression for $y_3$ into the given differential equation and verify that it is satisfied:

$$\begin{aligned}
y_3'' + P(x)y_3' + Q(x)y_3 &= (c_1y_1 + c_2y_2)'' + P(x)(c_1y_1 + c_2y_2)' + Q(x)(c_1y_1 + c_2y_2) \\
&= (c_1y_1'' + c_2y_2'') + P(x)(c_1y_1' + c_2y_2') + Q(x)(c_1y_1 + c_2y_2) \\
&= [y_1'' + P(x)y_1' + Q(x)y_1]c_1 + [y_2'' + P(x)y_2' + Q(x)y_2]c_2 \\
&= 0 \cdot c_1 + 0 \cdot c_2 = 0
\end{aligned}$$

where the coefficients of $c_1$ and $c_2$ vanish identically because, by hypothesis, both
$y_1$ and $y_2$ are solutions of the homogeneous equation (2).
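
Theorem 1 can be illustrated numerically. The sketch below is an addition to the text and rests on an assumed example: the equation $y'' + (1/x)y' - (4/x^2)y = 0$ with the two solutions $y_1 = x^2$ and $y_2 = x^{-2}$, whose derivatives are written out by hand. The helper names `residual` and `combine` are hypothetical. It forms an arbitrary linear combination and evaluates the left member of the equation, which should vanish to within rounding error.

```python
def residual(y, dy, d2y, x):
    """Left member y'' + P(x) y' + Q(x) y for P = 1/x and Q = -4/x**2."""
    return d2y(x) + dy(x) / x - 4 * y(x) / x**2

# Two known solutions, each stored as (y, y', y'').
y1 = (lambda x: x**2, lambda x: 2 * x, lambda x: 2.0)
y2 = (lambda x: x**-2, lambda x: -2 * x**-3, lambda x: 6 * x**-4)

def combine(c1, c2):
    """Return (y3, y3', y3'') for the linear combination y3 = c1*y1 + c2*y2."""
    return tuple(
        (lambda x, f=f, g=g: c1 * f(x) + c2 * g(x)) for f, g in zip(y1, y2)
    )

if __name__ == "__main__":
    y3 = combine(3.0, -7.5)  # arbitrary constants
    for x in (0.5, 1.0, 2.0):
        print(x, residual(*y3, x))
```

Any other choice of the constants passed to `combine` should give the same negligible residual, which is the content of the theorem.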

Theorem 1 assures us that, if we have two solutions of the 
homogeneous equation (2), then we can obtain infinitely many 
other solutions simply by forming arbitrary linear combinations 
of these two. However, it leaves completely unanswered the 
important question of whether or not all solutions of (2) can be 
obtained from the pair $(y_1, y_2)$ in this fashion. To decide this point
we need the stronger result contained in the next theorem. 

THEOREM 2 

If $y_1$ and $y_2$ are two solutions of the homogeneous equation

$$y'' + P(x)y' + Q(x)y = 0$$

for which

$$W(y_1, y_2) = \begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix} = y_1y_2' - y_2y_1' \neq 0$$ †

and if $\int P(x)\,dx$ exists, then there exist constants $c_1$ and $c_2$ such that any solution
$y_3$ of the homogeneous equation can be expressed in the form $y_3 = c_1y_1 + c_2y_2$.

PROOF To prove this theorem, it is convenient to show first that any pair of
solutions of Eq. (2), say $y_i$ and $y_j$, satisfies the relation

(3)  $W(y_i, y_j) = y_iy_j' - y_jy_i' = k_{ij}\exp\left[-\int P(x)\,dx\right]$

where $k_{ij}$ is a suitable constant. To establish this, we begin with the hypothesis


† The symbol $W(y_1, y_2)$ is customarily used to denote this combination of two
functions, in honor of Hoené Wronsky (1778-1853), Polish poet and mathematician
who was one of the first to study determinants of this type. Such
determinants are usually referred to as Wronskians.




that both $y_i$ and $y_j$ are solutions of (2) and, hence, that

$$y_i'' + P(x)y_i' + Q(x)y_i = 0$$
$$y_j'' + P(x)y_j' + Q(x)y_j = 0$$

If the first of these equations is multiplied by $y_j$ and subtracted from $y_i$ times the
second, we obtain

(4)  $(y_iy_j'' - y_jy_i'') + P(x)(y_iy_j' - y_jy_i') = 0$

Now,

$$\frac{dW(y_i, y_j)}{dx} = \frac{d(y_iy_j' - y_jy_i')}{dx} = (y_iy_j'' + y_i'y_j') - (y_jy_i'' + y_j'y_i') = y_iy_j'' - y_jy_i''$$

Hence, Eq. (4) can be written

$$\frac{dW(y_i, y_j)}{dx} + P(x)W(y_i, y_j) = 0$$

This is a very simple, separable differential equation whose solution can be 
written down immediately: 

$$W(y_i, y_j) = k_{ij}\exp\left[-\int P(x)\,dx\right]$$ †

where $k_{ij}$ is an integration constant. This establishes the relation (3), which is
usually known as Abel’s identity, after the Norwegian mathematician Niels Abel 
(1802-1829). 

Now consider the two pairs of solutions $(y_3, y_1)$ and $(y_3, y_2)$, where $y_3$ is any
solution whatsoever of the homogeneous equation (2). Applying Abel’s identity
(3) to each of these pairs in turn, we have

$$y_3y_1' - y_1y_3' = k_{31}\exp\left[-\int P(x)\,dx\right]$$
$$y_3y_2' - y_2y_3' = k_{32}\exp\left[-\int P(x)\,dx\right]$$

In general it is possible to solve these two simultaneous equations for $y_3$, getting

$$y_3 = \frac{y_1k_{32}\exp\left[-\int P(x)\,dx\right] - y_2k_{31}\exp\left[-\int P(x)\,dx\right]}{y_1y_2' - y_2y_1'}$$

If we now apply Abel’s identity to the denominator of the last expression, we
obtain

$$y_3 = \frac{y_1k_{32}\exp\left[-\int P(x)\,dx\right] - y_2k_{31}\exp\left[-\int P(x)\,dx\right]}{k_{12}\exp\left[-\int P(x)\,dx\right]} = \frac{k_{32}}{k_{12}}\,y_1 - \frac{k_{31}}{k_{12}}\,y_2$$

Interpreting $k_{32}/k_{12}$ as $c_1$ and $-k_{31}/k_{12}$ as $c_2$, we have thus succeeded in exhibiting
any solution $y_3$ as a linear combination $c_1y_1 + c_2y_2$ of the two particular solutions
$y_1$ and $y_2$, provided only that the expression

$$y_1y_2' - y_2y_1' = W(y_1, y_2)$$

by which we had to divide in order to solve for $y_3$, does not vanish. Theorem 2 is
thus established.
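
The constants appearing in this proof can be computed explicitly in a simple case. The following sketch is an illustration added here, not the author's: it assumes the equation $y'' + 3y' + 2y = 0$, for which $P = 3$ and $\exp[-\int P\,dx] = e^{-3x}$, together with the solutions $y_1 = e^{-x}$, $y_2 = e^{-2x}$, and $y_3 = 5y_1 - 2y_2$. The function names are hypothetical. It checks that the Abel constants $k_{ij}$ do not depend on $x$ and that $k_{32}/k_{12}$ and $-k_{31}/k_{12}$ recover the coefficients 5 and $-2$.

```python
import math

P = 3.0  # constant coefficient of y' in the assumed equation y'' + 3y' + 2y = 0

# Each solution is stored as a pair (y, y') of functions.
y1 = (lambda x: math.exp(-x), lambda x: -math.exp(-x))
y2 = (lambda x: math.exp(-2 * x), lambda x: -2 * math.exp(-2 * x))
y3 = (lambda x: 5 * y1[0](x) - 2 * y2[0](x),
      lambda x: 5 * y1[1](x) - 2 * y2[1](x))

def wronskian(u, v, x):
    """W(u, v) = u v' - v u' evaluated at x."""
    return u[0](x) * v[1](x) - v[0](x) * u[1](x)

def abel_constant(u, v, x):
    """k in W(u, v) = k exp(-3x); by Abel's identity it should not depend on x."""
    return wronskian(u, v, x) * math.exp(P * x)

def coefficients(x):
    """c1 = k32/k12 and c2 = -k31/k12, following the proof of Theorem 2."""
    k12 = abel_constant(y1, y2, x)
    k31 = abel_constant(y3, y1, x)
    k32 = abel_constant(y3, y2, x)
    return k32 / k12, -k31 / k12

if __name__ == "__main__":
    for x in (0.0, 1.0, 2.5):
        print(x, abel_constant(y1, y2, x), coefficients(x))
```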


† Since an exponential function can never vanish, it follows that wherever
$\int P(x)\,dx$ exists, the Wronskian of $y_i$ and $y_j$ is either never zero or identically
zero, according as $k_{ij} \neq 0$ or $k_{ij} = 0$.





From Theorem 2 it is clear that to find a complete solution
of Eq. (2) we must first find two particular solutions that have a
nonvanishing Wronskian, or in other words are linearly independent
(Exercise 6), and then we must form a linear combination of
these solutions with arbitrary coefficients. We must remember,
however, that, although there are infinitely many pairs of particular
solutions $y_1$ and $y_2$ which can be used as a basis for constructing
a complete solution of Eq. (2), neither Theorem 1 nor
Theorem 2 tells us how to find them. In fact there is no general
method for solving Eq. (2),* and the only procedure applicable in
all cases is one which permits us to determine a second, independent
solution when one solution is known.

To develop this procedure, let us suppose that $y_1(x) \neq 0$ is a
solution of Eq. (2), and let us attempt to find a function $\phi(x)$ with
the property that $\phi(x)y_1(x)$ is also a solution of (2). Substituting
$y = \phi(x)y_1(x)$ into Eq. (2), we have

$$\begin{aligned}
(y_1''\phi + 2y_1'\phi' + y_1\phi'') &+ P(x)(y_1'\phi + y_1\phi') + Q(x)(y_1\phi) \\
&= [y_1'' + P(x)y_1' + Q(x)y_1]\phi + [2y_1' + P(x)y_1]\phi' + y_1\phi'' = 0
\end{aligned}$$

Now, the coefficient of $\phi$ in the last expression is identically zero,
since, by hypothesis, $y_1$ is a solution of Eq. (2). Hence, the last
equation will be satisfied provided $\phi$ is chosen such that

$$y_1\phi'' + [2y_1' + P(x)y_1]\phi' = 0$$

This is a simple separable equation in $\phi'$, and we have

$$\frac{d\phi'}{\phi'} + \left[\frac{2y_1'}{y_1} + P(x)\right]dx = 0$$

or, integrating,

$$\ln|\phi'| + 2\ln|y_1| + \int P(x)\,dx = \ln|c|$$

Hence, combining the logarithms and taking antilogs,

$$\phi' = \frac{c\exp\left[-\int P(x)\,dx\right]}{y_1^2}$$

Integrating again, we find

$$\phi = c\int\frac{\exp\left[-\int P(x)\,dx\right]}{y_1^2}\,dx + k$$

from which we obtain, for all values of $c$ and $k$, the solution

(5)  $\phi(x)y_1(x) = cy_1(x)\displaystyle\int\frac{\exp\left[-\int P(x)\,dx\right]}{y_1^2(x)}\,dx + ky_1(x)$

Since this contains two arbitrary constants, it is actually a com- 
plete solution, provided that the two particular solutions from 


* The nearest thing to a general solution procedure is the use of infinite 
series, described in Sec. 9.1. 




which it is constructed, namely,

$$y_1(x)\int\frac{\exp\left[-\int P(x)\,dx\right]}{y_1^2(x)}\,dx \quad\text{and}\quad y_1(x)$$

have a nonvanishing Wronskian. It is not difficult to show that
this is always the case, although we shall leave the proof as an 
exercise. 


EXAMPLE 1 

Find a complete solution of the equation $x^2y'' + xy' - 4y = 0$, given that $y = x^2$ is one solution.
Substituting the assumed solution $y = x^2\phi$ into the given differential equation, we have

$$x^2(2\phi + 4x\phi' + x^2\phi'') + x(2x\phi + x^2\phi') - 4x^2\phi = 0$$

or, simplifying,

$$x\phi'' + 5\phi' = 0$$

Separating variables, we obtain

$$\frac{d\phi'}{\phi'} + \frac{5}{x}\,dx = 0$$

Then, integrating, we get

$$\ln|\phi'| + 5\ln|x| = \ln|c| \quad\text{or}\quad \phi' = \frac{c}{x^5}$$

and finally, integrating again,

$$\phi = -\frac{c}{4x^4} + k$$

The complete solution is, therefore,

$$y = x^2\phi = -\frac{c}{4x^2} + kx^2$$
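
The complete solution found in Example 1 can be checked by direct substitution. The sketch below is an added illustration with hypothetical helper names `solution` and `residual`. It uses exact rational arithmetic so that the left member $x^2y'' + xy' - 4y$ comes out exactly zero rather than merely small; the derivatives of $y = -c/(4x^2) + kx^2$ are entered by hand.

```python
from fractions import Fraction

def solution(c, k):
    """Return (y, y', y'') for y = -c/(4x^2) + k x^2."""
    y = lambda x: -c / (4 * x**2) + k * x**2
    dy = lambda x: c / (2 * x**3) + 2 * k * x
    d2y = lambda x: -3 * c / (2 * x**4) + 2 * k
    return y, dy, d2y

def residual(c, k, x):
    """Left member x^2 y'' + x y' - 4y of the equation in Example 1."""
    y, dy, d2y = solution(c, k)
    return x**2 * d2y(x) + x * dy(x) - 4 * y(x)

if __name__ == "__main__":
    # Arbitrary rational constants and sample points (x must be nonzero).
    for x in (Fraction(1, 2), Fraction(7, 3), Fraction(-5)):
        print(x, residual(Fraction(3), Fraction(-2, 5), x))
```

Passing `Fraction` values for `x` keeps every intermediate quantity rational, so no tolerance is needed in the comparison with zero.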

The solution of the nonhomogeneous equation (1) is based on 
the following theorem: 

THEOREM 3 

If $Y$ is any solution of the nonhomogeneous equation

$$y'' + P(x)y' + Q(x)y = R(x)$$

and if $c_1y_1 + c_2y_2$ is a complete solution of the homogeneous equation obtained
from this by deleting the term $R(x)$, then $y = c_1y_1 + c_2y_2 + Y$ is a complete
solution of the nonhomogeneous equation.

PROOF Let $y$ be any solution whatsoever of the nonhomogeneous equation (1).
Then

$$y'' + P(x)y' + Q(x)y = R(x)$$

and, similarly, since $Y$ is also a solution of (1),

$$Y'' + P(x)Y' + Q(x)Y = R(x)$$





If we subtract the last two equations, we obtain

$$(y'' - Y'') + P(x)(y' - Y') + Q(x)(y - Y) = 0$$

or

$$(y - Y)'' + P(x)(y - Y)' + Q(x)(y - Y) = 0$$

Thus the quantity $y - Y$ satisfies the homogeneous equation (2) and, hence, by
Theorem 2, must be expressible in the form

$$y - Y = c_1y_1 + c_2y_2$$

provided that $W(y_1, y_2) \neq 0$; that is, provided that $c_1y_1 + c_2y_2$ is a complete solution
of (2), as we assumed. Therefore, transposing,

$$y = c_1y_1 + c_2y_2 + Y$$

Since $y$ was any solution of the nonhomogeneous equation (1), Theorem 3 is thus
established.

The term $Y$, which can be any solution of (1) no matter how
special, is called a particular integral of the nonhomogeneous
equation. The expression $c_1y_1 + c_2y_2$, which is a complete solution
of the homogeneous equation corresponding to (1), is called the
complementary function of the nonhomogeneous equation. The
steps to be carried out in solving an equation of the form (1) can
be summarized as follows:

a Delete the term $R(x)$ from the given equation, and then find
two solutions of the resulting homogeneous equation which
have a nonvanishing Wronskian. Then combine these to form
the complementary function $c_1y_1 + c_2y_2$ of the given equation.

b Find one particular solution $Y$ of the nonhomogeneous equation
itself.

c Add the complementary function $c_1y_1 + c_2y_2$ found in step a to
the particular integral $Y$ found in step b, to obtain the complete
solution $y = c_1y_1 + c_2y_2 + Y$ of the given equation.

In the following sections we shall investigate how these
theoretical steps can be carried out when $P(x)$ and $Q(x)$ are
constants; that is, when we have the so-called linear differential
equation with constant coefficients.
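
Steps a, b, and c can be traced on a small example. The sketch below is an added illustration and not an example from the text: it assumes the equation $y'' + 3y' + 2y = 4$, for which the homogeneous equation has the solutions $e^{-x}$ and $e^{-2x}$ and for which $Y = 2$ is a particular integral found by inspection. The function names are hypothetical. It checks that the Wronskian does not vanish and that $y = c_1e^{-x} + c_2e^{-2x} + 2$ satisfies the equation for arbitrary constants.

```python
import math

def lhs(y, dy, d2y, x):
    """Left member y'' + 3y' + 2y of the assumed equation."""
    return d2y(x) + 3 * dy(x) + 2 * y(x)

def wronskian(x):
    """Step a: W(e^{-x}, e^{-2x}) = -e^{-3x}, which never vanishes."""
    return (math.exp(-x) * (-2 * math.exp(-2 * x))
            - math.exp(-2 * x) * (-math.exp(-x)))

def complete_solution(c1, c2):
    """Step c: complementary function plus the particular integral Y = 2 (step b)."""
    y = lambda x: c1 * math.exp(-x) + c2 * math.exp(-2 * x) + 2.0
    dy = lambda x: -c1 * math.exp(-x) - 2 * c2 * math.exp(-2 * x)
    d2y = lambda x: c1 * math.exp(-x) + 4 * c2 * math.exp(-2 * x)
    return y, dy, d2y

if __name__ == "__main__":
    print("W(0) =", wronskian(0.0))
    for c1, c2 in ((0.0, 0.0), (1.0, -3.0), (2.5, 4.0)):
        y = complete_solution(c1, c2)
        # Each value should equal the right member R(x) = 4.
        print(c1, c2, [lhs(*y, x) for x in (0.0, 1.0, 2.0)])
```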

EXERCISES 

1 Using the one solution indicated, find a complete solution of each of the following equations:

a $y'' + y = 0$    $y_1 = \sin x$

b $y'' + 3y' + 2y = 0$    $y_1 = e^{-x}$

c $(1 - 2x)y'' + 2y' + (2x - 3)y = 0$    $y_1 = e^x$

d $(2x - x^2)y'' + 2(x - 1)y' - 2y = 0$    $y_1 = x - 1$

e $x^2y'' + 4xy' - 4y = 0$    $y_1 = x$

f $x^2y'' - (x^2 + 2x)y' + (x + 2)y = 0$    $y_1 = x$

2 Verify that each of the following equations has the indicated solutions, and in each case
construct two different complete solutions:

a $y'' - y = 0$    $y_1 = e^x$    $y_2 = e^{-x}$

b $y'' - 3y' + 2y = 0$    $y_1 = e^x$    $y_2 = e^{2x}$

c $y'' + y = 0$    $y_1 = \sin(x + \pi/4)$    $y_2 = \sin(x - \pi/4)$

d $y'' - 2(\cot 2x)y' = 0$    $y_1 = \sin^2 x$    $y_2 = \cos^2 x$





3 Show that the two solutions

$$y_1 \quad\text{and}\quad y_1\int\frac{\exp\left[-\int P(x)\,dx\right]}{y_1^2}\,dx$$

of the equation $y'' + P(x)y' + Q(x)y = 0$ have a nonvanishing Wronskian.

4 If the Wronskian of two functions is different from zero at every point of an interval, show 
that there is no point of the interval at which the functions are simultaneously zero. 

5 If the Wronskian of two functions is different from zero at every point of an interval, show
that there is no point of the interval at which either function has a repeated zero. 

6 Show that, if two differentiable functions are linearly dependent, their Wronskian is equal 
to zero. 

7 Show that the converse of the assertion of Exercise 6 is false. Hint: Consider, for $-\infty < x < \infty$,
the following pair of functions:



$-\infty < x \leq 0$
$0 < x < \infty$


8 Show that the converse of the assertion of Exercise 6 is true over any interval on which the 
two functions have no common zero. 

9 Show that, if the Wronskian of two functions is different from zero at every point of an 
interval, then, between any two consecutive zeros of either of the functions in that interval, 
there is exactly one zero of the other function. [Hint: Let $y_1$ and $y_2$ be the two functions,
let $a$ and $b$ be consecutive zeros of $y_1$, apply Rolle’s theorem to the quotient $y_1/y_2$ over the
interval $(a, b)$, and note the contradiction unless $y_2 = 0$ at some point between $a$ and $b$.]

10 Explain how Abel’s identity can be used to find a second solution of the equation
$y'' + P(x)y' + Q(x)y = 0$ when one solution is known. Illustrate the method by applying it to
parts a and b of Exercise 1. 


2.2 The homogeneous linear equation with constant coefficients

When P(x) and Q(x) are constants, the general linear second- 
order differential equation can be written in the standard form 
(1)  $ay'' + by' + cy = f(x)$


A second standard form which is often encountered is based 
upon the so-called operator notation. In this, the symbol of 
differentiation d/dx is replaced by D, so that, by definition, 


$$Dy \equiv \frac{dy}{dx}$$ †


As an immediate extension, the second derivative, which, of
course, is obtained by a repetition of the process of differentiation,
is written

$$\frac{d^2y}{dx^2} = D(Dy) \equiv D^2y$$


t Just a .s the prime notation, ?/, y", . . . , may in specific instances indicate 
derivatives with respect to x, l, or any other independent variable, so the 
operator notation, Dy, D 2 y, . . , may also indicate derivatives with 

respect to an independent variable other than x, depending on the context. 




Similarly,    d^3y/dx^3 = D(D^2 y) = D^3 y
              d^4y/dx^4 = D(D^3 y) = D^4 y

Evidently, positive integral powers of D (which are the only ones 
we have defined) obey the usual laws of exponents. 

If due care is taken to see that variables are not moved across 
the sign of differentiation by a careless interchange of the order of 
factors containing variable coefficients, the operator D can be 
handled in many respects as though it were a simple algebraic 
quantity. For instance, after defining (aD^2 + bD + c)f(x) to
mean aD^2 f(x) + bDf(x) + cf(x), we have, for the polynomial
operator 3D^2 - 10D - 8 and its factored equivalents,

(3D^2 - 10D - 8)x^2 = 3(2) - 10(2x) - 8(x^2) = 6 - 20x - 8x^2
(3D + 2)(D - 4)x^2 = (3D + 2)(2x - 4x^2)
                   = (6 - 24x) + (4x - 8x^2) = 6 - 20x - 8x^2
(D - 4)(3D + 2)x^2 = (D - 4)(6x + 2x^2)
                   = (6 + 4x) - (24x + 8x^2) = 6 - 20x - 8x^2
which illustrates how algebraically equivalent forms of an oper- 
ator yield identical results when applied to the same function. 
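
The three computations above can also be checked mechanically. The following is only a sketch: it stores a polynomial as a list of coefficients in ascending powers, and the helper names (deriv, apply_operator, and so on) are our own illustrative choices, not notation from the text.

```python
# Polynomials are coefficient lists in ascending powers:
# [c0, c1, c2] stands for c0 + c1*x + c2*x**2.

def deriv(p):
    """Return the derivative of polynomial p (the action of D)."""
    return [k * p[k] for k in range(1, len(p))] or [0]

def add(p, q):
    """Add two polynomials of possibly different lengths."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [u + v for u, v in zip(p, q)]

def scale(c, p):
    """Multiply polynomial p by the constant c."""
    return [c * u for u in p]

def apply_operator(op, p):
    """Apply op[0] + op[1]*D + op[2]*D**2 + ... to polynomial p."""
    result, dp = [0], p
    for c in op:
        result = add(result, scale(c, dp))
        dp = deriv(dp)
    return result

def trim(p):
    """Drop trailing zero coefficients so results compare cleanly."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

x_squared = [0, 0, 1]
# 3D^2 - 10D - 8 applied directly
direct = trim(apply_operator([-8, -10, 3], x_squared))
# (3D + 2)(D - 4): apply D - 4 first, then 3D + 2
left = trim(apply_operator([2, 3], apply_operator([-4, 1], x_squared)))
# (D - 4)(3D + 2): apply 3D + 2 first, then D - 4
right = trim(apply_operator([-4, 1], apply_operator([2, 3], x_squared)))
print(direct, left, right)
```

Each of the three lists represents 6 - 20x - 8x^2. The agreement depends on the coefficients of the operators being constants; with variable coefficients the order of the factors matters, as the warning above indicates.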

Using the operator D, we can evidently write Eq. (1) in the 
alternative standard form 
(1a)   (aD^2 + bD + c)y = f(x)

Many writers base the solution of Eq. (1) upon the oper- 
ational properties of the symbol D. However, we shall postpone 
all operational methods until the chapter on the Laplace trans- 
formation, where operational calculus can be developed easily 
and efficiently in its proper setting. 

Following the theory of the last section, we first attempt to 
find a complete solution of the homogeneous equation 

(2) ay" + by' + cy = 0 

or 

(2a)   (aD^2 + bD + c)y = 0

obtained from (1) or (1a) by deleting f(x). In searching for
particular solutions of (2), it is natural to try

y = e^{mx}

where m is a constant to be determined, because all derivatives of
this function are alike except for a numerical coefficient.
Substituting into Eq. (2) and then factoring e^{mx} from every term, we
have

e^{mx}(am^2 + bm + c) = 0

as the condition to be satisfied if y = e^{mx} is to be a solution. Since
e^{mx} can never be zero, it is thus necessary that

(3)    am^2 + bm + c = 0

This purely algebraic equation is known as the characteristic or
auxiliary equation of either Eq. (1) or Eq. (2). In practice it is
obtained not by substituting y = e^{mx} into the given differential
equation and then simplifying, but rather by substituting m^2 for
y", m for y', and 1 for y in the given equation, or, still more simply,
by equating to zero the operational coefficient of y and then
letting D play the role of m:

aD^2 + bD + c = 0

The characteristic equation is a simple quadratic which will
in general be satisfied by two values of m:

m = [-b ± sqrt(b^2 - 4ac)] / (2a)

Using these values, say m_1 and m_2, two solutions

y_1 = e^{m_1 x}    and    y_2 = e^{m_2 x}

can be constructed. From this pair, according to Theorem 1,
Sec. 2.1, an infinite family of solutions

(4)    y = c_1 y_1 + c_2 y_2 = c_1 e^{m_1 x} + c_2 e^{m_2 x}

can be formed. Moreover, by Theorem 2, Sec. 2.1, if the Wronskian
of these solutions is different from zero, then (4) is a complete
solution of Eq. (2); i.e., it contains all possible solutions of the
homogeneous equation. Accordingly, we compute

W(y_1, y_2) = y_1 y_2' - y_2 y_1' = e^{m_1 x}(m_2 e^{m_2 x}) - e^{m_2 x}(m_1 e^{m_1 x})
            = (m_2 - m_1)e^{(m_1 + m_2)x}

Since e^{(m_1 + m_2)x} can never vanish, it is clear that a complete solution
of Eq. (2) is always given by (4), except in the special case when
m_1 = m_2 and the Wronskian vanishes identically.


EXAMPLE 1

Find a complete solution of the differential equation y" + 7y' + 12y = 0.
The characteristic equation in this case is

m^2 + 7m + 12 = 0

and its roots are

m_1 = -3    and    m_2 = -4

Since these values of m are different, a complete solution is

y = c_1 e^{-3x} + c_2 e^{-4x}


EXAMPLE 2

Find a complete solution of the equation y" + 2y' + 5y = 0.
The characteristic equation in this case is

m^2 + 2m + 5 = 0

and its roots are

m_1 = -1 + 2i    and    m_2 = -1 - 2i

Since these are distinct, a complete solution is

y = c_1 e^{(-1+2i)x} + c_2 e^{(-1-2i)x}

Although the last expression is undeniably a complete solu- 
tion of the given equation, it is unsatisfactory for many practical 
purposes because it involves imaginary exponentials, which are 
awkward to handle and are not tabulated. It is, therefore, a 
matter of considerable importance to construct a more convenient 
complete solution in the case in which m_1 and m_2 are conjugate
complex quantities. 

To do this, let us suppose that

m_1 = p + iq    and    m_2 = p - iq

so that a complete solution as first constructed is

y = c_1 e^{(p+iq)x} + c_2 e^{(p-iq)x}

By factoring out e^{px}, this can be written as

y = e^{px}(c_1 e^{iqx} + c_2 e^{-iqx})

Now the expression in parentheses can be simplified by using the
Euler formulas (Sec. 14.7)

e^{iθ} = cos θ + i sin θ
e^{-iθ} = cos θ - i sin θ

taking θ = qx. The result of these substitutions is

y = e^{px}[c_1(cos qx + i sin qx) + c_2(cos qx - i sin qx)]
  = e^{px}[(c_1 + c_2) cos qx + i(c_1 - c_2) sin qx]

If we now define two new arbitrary constants by the equations

A = c_1 + c_2    and    B = i(c_1 - c_2)

the complete solution can finally be put in the purely real form

y = e^{px}(A cos qx + B sin qx)

Of course, it is not difficult to verify directly that both

y_1 = e^{px} cos qx    and    y_2 = e^{px} sin qx

are particular solutions of the homogeneous equation (2). For a 
completely satisfactory derivation this should now be done, since 
we do not yet know that our formal treatment of complex 
exponentials, as though they obeyed the same laws as real 
exponentials, is justified. 
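
As a numerical illustration only (it does not replace the direct verification just called for), the sketch below compares the complex-exponential form with the real form at a few points, using the relations A = c_1 + c_2 and B = i(c_1 - c_2). The particular constants are arbitrary choices of ours, and the function names are hypothetical.

```python
import cmath
import math

def complex_form(c1, c2, p, q, x):
    """y = c1*e^{(p+iq)x} + c2*e^{(p-iq)x}."""
    return c1 * cmath.exp(complex(p, q) * x) + c2 * cmath.exp(complex(p, -q) * x)

def real_form(A, B, p, q, x):
    """y = e^{px}(A cos qx + B sin qx); A and B may be complex in general."""
    return math.exp(p * x) * (A * math.cos(q * x) + B * math.sin(q * x))

p, q = -1.0, 2.0            # roots -1 ± 2i, as in Example 2
c1 = complex(0.5, -1.5)     # arbitrary illustrative constant
c2 = c1.conjugate()         # a conjugate pair makes A and B real
A = c1 + c2
B = 1j * (c1 - c2)

# Largest discrepancy between the two forms at a few sample points
max_gap = max(
    abs(complex_form(c1, c2, p, q, x) - real_form(A, B, p, q, x))
    for x in (0.0, 0.3, 1.7)
)
print(A, B, max_gap)
```

Choosing c_2 as the complex conjugate of c_1 is what makes A and B real here; for unrelated complex c_1 and c_2 the two forms still agree, but A and B are then complex.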

EXAMPLE 2 (continued) 

Applying the preceding reasoning to Example 2, it is clear that p = -1 and q = 2. Hence, the
complete solution can be written


y = e^{-x}(A cos 2x + B sin 2x)





When the characteristic equation has equal roots, the two
independent solutions normally arising from the substitution of
y = e^{mx} become identical, and, as we pointed out above, we do
not have an adequate basis for constructing a complete solution.
To find a second, independent solution in this case we use the
method developed in the last section.

Let the differential equation in question be

y" - 2ry' + r^2 y = 0

so that its characteristic equation

m^2 - 2rm + r^2 = 0

has the repeated root m_1 = r. Then y_1 = e^{rx} is one solution, and,
from Eq. (5), Sec. 2.1, a second independent solution is given by

y_2 = y_1 ∫ [e^{-∫P(x)dx} / y_1^2] dx = e^{rx} ∫ [e^{2rx} / e^{2rx}] dx = xe^{rx} ≡ xe^{m_1 x}

Thus, in the exceptional case in which the characteristic equation has
equal roots, a complete solution of Eq. (2) is

y = c_1 e^{m_1 x} + c_2 x e^{m_1 x}

EXAMPLE 3

Find a complete solution of the equation (D^2 + 6D + 9)y = 0.

In this case the characteristic equation

m^2 + 6m + 9 = 0

is a perfect square, with roots m_1 = m_2 = -3. Hence, by our last remark, a complete solution
of the given equation is

y = c_1 e^{-3x} + c_2 x e^{-3x}

The complete process for solving the homogeneous equation 
(2) in all possible cases is summarized in Table 2.1. 


Table 2.1

Differential equation:      ay" + by' + cy = 0     or    (aD^2 + bD + c)y = 0
Characteristic equation:    am^2 + bm + c = 0      or    aD^2 + bD + c = 0

Nature of the roots of the      Condition on the coefficients     Complete solution of the
characteristic equation         of the characteristic equation    differential equation
----------------------------    ------------------------------    -----------------------------------
Real and unequal,               b^2 - 4ac > 0                     y = c_1 e^{m_1 x} + c_2 e^{m_2 x}
m_1 ≠ m_2

Real and equal,                 b^2 - 4ac = 0                     y = c_1 e^{m_1 x} + c_2 x e^{m_1 x}
m_1 = m_2

Conjugate complex,              b^2 - 4ac < 0                     y = e^{px}(A cos qx + B sin qx)
m_1 = p + iq, m_2 = p - iq
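
The case analysis of Table 2.1 can be sketched as a small routine. This is an illustration under our own assumptions: the function name complete_solution is hypothetical, the coefficients are taken to be real with a ≠ 0, and the exact test for a zero discriminant is reliable only for integer or otherwise exactly represented coefficients.

```python
def complete_solution(a, b, c):
    """Classify the roots of a*m^2 + b*m + c = 0 (real a, b, c with a != 0)
    and return (case, text of the complete solution), following Table 2.1."""
    disc = b * b - 4 * a * c
    if disc > 0:
        r = disc ** 0.5
        m1, m2 = (-b + r) / (2 * a), (-b - r) / (2 * a)
        return "real and unequal", f"y = c1*exp({m1:g}x) + c2*exp({m2:g}x)"
    if disc == 0:
        m = -b / (2 * a)
        return "real and equal", f"y = c1*exp({m:g}x) + c2*x*exp({m:g}x)"
    # Conjugate complex roots p ± iq
    p = -b / (2 * a)
    q = abs((-disc) ** 0.5 / (2 * a))
    return "conjugate complex", f"y = exp({p:g}x)*(A cos({q:g}x) + B sin({q:g}x))"

print(complete_solution(1, 7, 12))   # Example 1
print(complete_solution(1, 2, 5))    # Example 2
print(complete_solution(1, 6, 9))    # Example 3
```

The three calls correspond to Examples 1, 2, and 3 of this section, one for each row of the table.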






In particular applications, the two arbitrary constants in the 
complete solution must usually be determined to fit initial condi- 
tions on y and y', or their equivalent. The following examples will 
clarify the procedure. 


EXAMPLE 4

Find the solution of the equation y" - 4y' + 4y = 0 for which y = 3 and y' = 4 when t = 0.
The characteristic equation of the differential equation is

m^2 - 4m + 4 = 0

Its roots are m_1 = m_2 = 2; hence, a complete solution is

y = c_1 e^{2t} + c_2 t e^{2t}

By differentiating this, we find

y' = (2c_1 + c_2)e^{2t} + 2c_2 t e^{2t}

Substituting the given data into the equations for y and y', respectively, we have

3 = c_1        4 = 2c_1 + c_2

Hence, c_1 = 3, c_2 = -2; and the required solution is

y = 3e^{2t} - 2te^{2t}
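
The fitting of the constants in this example can be checked with a short sketch. The helper name repeated_root_constants is our own, and the residual test simply substitutes the closed-form derivatives of the result back into y" - 4y' + 4y.

```python
import math

def repeated_root_constants(m, y0, yp0):
    """For y = c1*e^{mt} + c2*t*e^{mt}, fit y(0) = y0 and y'(0) = yp0.
    At t = 0 we have y = c1 and y' = m*c1 + c2."""
    c1 = y0
    c2 = yp0 - m * c1
    return c1, c2

c1, c2 = repeated_root_constants(2, 3, 4)

def y(t):
    return (c1 + c2 * t) * math.exp(2 * t)

def yp(t):
    # first derivative: (2c1 + c2)e^{2t} + 2c2*t*e^{2t}
    return (2 * c1 + c2 + 2 * c2 * t) * math.exp(2 * t)

def ypp(t):
    # second derivative: (4c1 + 4c2)e^{2t} + 4c2*t*e^{2t}
    return (4 * c1 + 4 * c2 + 4 * c2 * t) * math.exp(2 * t)

def residual(t):
    """Left member y'' - 4y' + 4y evaluated for the fitted solution."""
    return ypp(t) - 4 * yp(t) + 4 * y(t)

print(c1, c2, y(0.0), yp(0.0), residual(0.7))
```

The residual should vanish up to rounding error at every t, and the values at t = 0 reproduce the given conditions.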

EXAMPLE 5

Find the solution of the equation (4D^2 + 16D + 17)y = 0 for which y = 1 when t = 0 and
y = 0 when t = π.

In this case the characteristic equation is

4m^2 + 16m + 17 = 0

and from its roots, m = -2 ± (1/2)i, we obtain the complete solution

y = e^{-2t}(A cos (t/2) + B sin (t/2))

Substituting the given conditions into this equation, we find

1 = A    and    0 = e^{-2π}B  or  B = 0

Hence, the required solution is

y = e^{-2t} cos (t/2)


EXERCISES 

1 What is the difference between Dy and yD?

2 Verify that

(D + 1)(D^2 + 2) sin 3x = (D^2 + 2)(D + 1) sin 3x = (D^3 + D^2 + 2D + 2) sin 3x

3 Is (D + x)(D + 2x)e^x = (D + 2x)(D + x)e^x? Explain.

4 What meaning, if any, do you think can be assigned to D^0? D^{-1}? D^{-2}?

Find a complete solution of each of the following equations:

5   y" + y' - 2y = 0                   6   5y" + 5y' + y = 0

7   y" - 5y = 0                        8   y" - 5y' = 0

9   (4D^2 + 4D + 1)y = 0               10  (9D^2 - 12D + 4)y = 0

11  10y" + 5y' + y = 0                 12  y" + 10y' + 26y = 0





Find the solution of each of the following equations which satisfies the given conditions: 


13

y" + By' - 4y = 0 

y = 4, y' = -2 when x = 0

14

y" + 4y = 0

y = 2, y' = 6 when x = 0

15

y" — 4?/ = 0 

y = 1, y' = -1 when x = 0

16

25y" + 20y' + 4y = 0

y = y' = 0 when x = 0

17 

(D 2 + 62) + % - 0 

y = 0, y' = 3 when x = 0

18

(D^2 + 2D + 5)y = 0

y = 1 when x = 0, y' = 0 when x = π

19

(D^2 + 2D + 5)y = 0

y = 1 when x = 0, y = 0 when x = π

20

a Verify by direct substitution that y_1 = e^{px} cos qx and y_2 = e^{px} sin qx are solutions of
the equation

y" - 2py' + (p^2 + q^2)y = 0

b Verify that these solutions have a nonvanishing Wronskian.

21 Show that, for k ≠ 0, both y = A cos (kx + B) and y = G sin (kx + H) are complete
solutions of the equation y" + k^2 y = 0.

22 Show that, for k ≠ 0, y = A cosh kx + B sinh kx is a complete solution of the equation
y" - k^2 y = 0.

23 Show that y = e^{px}(A cosh qx + B sinh qx) is a complete solution of the equation ay" +
by' + cy = 0, when the roots of the characteristic equation are p ± q, q ≠ 0.

24 If the roots of its characteristic equation are real, show that no nontrivial solution of the
equation ay" + by' + cy = 0 can have more than one real zero.

25 If the characteristic equation of the differential equation ay" + by' + cy = 0 has distinct
roots m_1 and m_2, show that

y = (e^{m_1 x} - e^{m_2 x}) / (m_1 - m_2)

is a particular solution of the equation. Can we take the limit of this expression as m_2 → m_1
and obtain a second solution of the differential equation when its characteristic equation
has equal roots?


2.3 

The nonhomogeneous equation

In the last section we learned how to solve the homogeneous
equation ay" + by' + cy = 0, and with this knowledge we can now
obtain the complementary function of the nonhomogeneous
equation

(1)    ay" + by' + cy = f(x)

However, we must also have a particular integral, i.e., a particular 
solution, of Eq. (1), before we can construct its complete solution, 
namely, 

y = complementary function + particular integral 

Various procedures are available for the determination of par- 
ticular solutions of Eq. (1), some applicable no matter what f(x) 
may be, others useful only when f(x) belongs to some suitably 
specialized class of functions. It should be borne in mind, however,
that, in applying Theorem 3, Sec. 2.1, the important thing
is not how we obtain a particular solution of Eq. (1) but merely 
that we have one such solution. Any method, from outright 
guessing to the most sophisticated theoretical technique, is 
legitimate, provided that it leads to a solution that can be checked 
in (1). In this section we shall introduce the so-called method of 
undetermined coefficients, which appears initially to be based on 
little more than guesswork, but which is readily formalized into 
a well-defined procedure applicable to a well-defined and very 
important class of cases. 

To illustrate the method, suppose that we wish to find a
particular integral of the equation

(2)    y" + 4y' + 3y = 5e^{2x}

Since differentiating an exponential of the form e^{kx} merely
reproduces the function with, at most, a change in its numerical
coefficient, it is natural to "guess" that it may be possible to
determine A so that

Y = Ae^{2x}

will be a solution of (2). To check this, we substitute Y = Ae^{2x}
into the given equation, getting

4Ae^{2x} + 8Ae^{2x} + 3Ae^{2x} = 5e^{2x}    or    15Ae^{2x} = 5e^{2x}

which will be an identity if and only if A = 1/3. Thus, the required
particular integral is

Y = (1/3)e^{2x}

Now suppose that the right-hand member of (2) had been 
5 sin 2x. Guided by our previous success we might perhaps be led 
to try 

Y = A sin 2x

as a particular integral. Substituting this to check whether or not
it can be a solution, we obtain

-4A sin 2x + 8A cos 2x + 3A sin 2x = 5 sin 2x
             -A sin 2x + 8A cos 2x = 5 sin 2x

and this cannot be an identity unless, simultaneously, A = -5
and A = 0, which is absurd. The difficulty here, of course, is that
differentiating sin 2x introduced the new function cos 2x, which
must also be eliminated identically from the equation resulting
from the substitution of Y = A sin 2x. Since the one arbitrary
constant A cannot satisfy two independent conditions, it is clear
that we must arrange to incorporate two arbitrary constants in
our tentative choice for Y. This is easily done by assuming


Y = A sin 2x + B cos 2x




which contains the necessary second parameter, yet cannot intro- 
duce any further new functions, since it already is a linear com- 
bination of all the independent terms that can be obtained from 
sin 2x by repeated differentiation. The actual determination of 
A and B is a simple matter, for substitution into the given
differential equation yields

(-4A sin 2x - 4B cos 2x) + 4(2A cos 2x - 2B sin 2x) + 3(A sin 2x + B cos 2x) = 5 sin 2x
                                  (-A - 8B) sin 2x + (8A - B) cos 2x = 5 sin 2x

and for this to be an identity requires that

-A - 8B = 5    and    8A - B = 0

from which we find immediately that A = -1/13 and B = -8/13.

Hence, finally,

Y = -(sin 2x + 8 cos 2x)/13
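
The two conditions on A and B are simply a pair of linear equations, and the same bookkeeping works for any equation ay" + by' + cy = F sin kx. The sketch below is our own illustration, with a hypothetical function name and exact rational arithmetic from Python's fractions module; it assumes sin kx is not already a solution of the homogeneous equation.

```python
from fractions import Fraction

def sine_forcing_coefficients(a, b, c, F, k):
    """Return (A, B) such that Y = A sin kx + B cos kx satisfies
    a*Y'' + b*Y' + c*Y = F sin kx.

    Matching coefficients of sin kx and cos kx gives
        p*A - q*B = F    and    q*A + p*B = 0,
    where p = c - a*k^2 and q = b*k.
    """
    p = Fraction(c - a * k * k)
    q = Fraction(b * k)
    det = p * p + q * q
    if det == 0:
        # Happens only when sin kx solves the homogeneous equation
        raise ValueError("sin kx duplicates the complementary function")
    return F * p / det, -F * q / det

A, B = sine_forcing_coefficients(1, 4, 3, 5, 2)   # y'' + 4y' + 3y = 5 sin 2x
print(A, B)
```

For the equation treated above, p = -1 and q = 8, so the pair of conditions is exactly -A - 8B = 5 and 8A - B = 0.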

With these illustrations in mind we are now in a position to 
describe more precisely the use of the method of undetermined 
coefficients for finding particular integrals: 

rule 1 If f(x) is a function for which repeated differentiation yields only
a finite number of linearly independent expressions, then, in
general, a particular integral Y for the nonhomogeneous equation
ay" + by' + cy = f(x) can be found by

a Assuming Y to be an arbitrary linear combination of all the
linearly independent terms which arise from f(x) by repeated
differentiation

b Substituting Y into the given differential equation

c Determining the arbitrary constants in Y in such a way that
the resulting equation is identically satisfied.

The class of functions f(x) possessing only a finite number of 
linearly independent derivatives consists of the simple functions 

k

x^n    (n a positive integer)

e^{kx}

cos kx

sin kx

and any others obtainable from these by a finite number of
additions, subtractions, and multiplications. If f(x) possesses infinitely
many independent derivatives, as is the case, for instance, with
the simple function 1/x, it is occasionally convenient to assume
for Y an infinite series whose terms are the respective derivatives
of f(x) each multiplied by an arbitrary constant. However, the
use of the method of undetermined coefficients in such cases
involves questions of convergence which never arise when f(x) has
only a finite number of independent derivatives.

There is one important exception to the procedure we have
just been outlining, which we must now investigate. Suppose, for
example, that we wish to find a particular integral for the equation

(3)    y" + 4y' + 3y = 5e^{-3x}

Proceeding in the way we have just described, we would start
with

Y = Ae^{-3x}

getting

9Ae^{-3x} - 12Ae^{-3x} + 3Ae^{-3x} = 5e^{-3x}
                                 0 = 5e^{-3x}    (!)

This is obviously an impossibility, and it is important that we be
able to recognize and handle such cases. The source of the
difficulty is easily identified. For the characteristic equation of Eq. (3)
is

m^2 + 4m + 3 = 0

and, since its roots are m_1 = -3, m_2 = -1, the complementary
function of Eq. (3) is

y = c_1 e^{-3x} + c_2 e^{-x}

Thus, the term on the right-hand side of (3) is proportional to a
term in the complementary function; that is, it is a solution of
the related homogeneous equation and, hence, can yield only 0
when it is substituted into the left member.

One way in which we might attempt to avoid this difficulty
would be to find a particular integral of the equation

(4)    y" + 4y' + 3y = 5e^{kx}


with k ≠ -3, and then take the limit of this solution as k → -3.
Thus, substituting Y = Ae^{kx}, as usual, we have

k^2 Ae^{kx} + 4kAe^{kx} + 3Ae^{kx} = 5e^{kx}

whence A = 5/(k^2 + 4k + 3) and

Y = 5e^{kx} / (k^2 + 4k + 3)

Unfortunately, the limit of this as k → -3 is infinite, and so we
must look further. However, since Be^{-3x} is a solution of the
related homogeneous equation for all values of B, it follows,
taking B = -5/(k^2 + 4k + 3), that

y_1 = -5e^{-3x} / (k^2 + 4k + 3)

is a particular solution of the homogeneous equation and, hence,
that

Y + y_1 = (5e^{kx} - 5e^{-3x}) / (k^2 + 4k + 3)

is another particular integral of the nonhomogeneous equation
(4). Now as k → -3 [and Eq. (4) approaches the given equation
(3)], this function becomes an indeterminate of the form 0/0.
Evaluating it by L'Hospital's rule, we find that the limit is

5xe^{-3x} / (-2)

and by direct substitution it is easily verified that this is actually
a solution of Eq. (3).
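
The limiting process can also be watched numerically. The sketch below, with function names of our own choosing, evaluates the particular integral of Eq. (4) for values of k approaching -3 and compares it with the limit obtained by L'Hospital's rule (differentiating numerator and denominator with respect to k gives 5xe^{kx} and 2k + 4, respectively).

```python
import math

def shifted_particular_integral(k, x):
    """(5e^{kx} - 5e^{-3x}) / (k^2 + 4k + 3), defined for k != -3 and k != -1."""
    return (5 * math.exp(k * x) - 5 * math.exp(-3 * x)) / (k * k + 4 * k + 3)

def limiting_particular_integral(x):
    """The limit as k -> -3, namely 5x e^{-3x} / (-2)."""
    return 5 * x * math.exp(-3 * x) / (-2)

for eps in (1e-2, 1e-4, 1e-6):
    k = -3 + eps
    gap = abs(shifted_particular_integral(k, 1.0) - limiting_particular_integral(1.0))
    print(eps, gap)
```

The gap should shrink roughly in proportion to the distance of k from -3. Taking k much closer than about 1e-6 to -3 is not advisable in floating-point arithmetic, because numerator and denominator both suffer cancellation.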

It is not necessary to go through this limiting process in 
particular cases where f(x) is proportional to a term already in the 
complementary function, for we have the following extension of 
Rule 1 : 

rule 2 If any term in the expression Y normally used to find a particular 
integral of the nonhomogeneous equation ay" + by' + cy = f(x) 
duplicates a term in the complementary function, then before it 
is substituted into the equation, Y must be multiplied by the 
lowest positive integral power of x which will eliminate all such 
duplications. 

The results of our discussion are summarized in Table 2.2. 

Table 2.2

Differential equation:    ay" + by' + cy = f(x)    or    (aD^2 + bD + c)y = f(x)

f(x)*                                     Necessary choice for particular integral Y†
--------------------------------------    ---------------------------------------------------------
1. α                                      A

2. αx^n  (n a positive integer)           A_0 x^n + A_1 x^{n-1} + · · · + A_{n-1} x + A_n

3. αe^{rx}  (r either real or complex)    Ae^{rx}

4. α cos kx‡                              A cos kx + B sin kx
5. α sin kx

6. αx^n e^{rx} cos kx                     (A_0 x^n + · · · + A_{n-1} x + A_n)e^{rx} cos kx
7. αx^n e^{rx} sin kx                       + (B_0 x^n + · · · + B_{n-1} x + B_n)e^{rx} sin kx


* When f(x) consists of a sum of several terms, the appropriate choice for Y
is the sum of the Y expressions corresponding to these terms individually.

† Whenever a term in any of the Y's listed in this column duplicates a term
already in the complementary function, all terms in that Y must be
multiplied by the lowest positive integral power of x sufficient to eliminate the
duplication.

‡ The hyperbolic functions cosh kx and sinh kx can be handled either by
expressing them in terms of exponentials or by using formulas entirely
analogous to those in lines 4, 5, 6, and 7.




EXAMPLE 1

Find a complete solution of the equation y" + 9y = 2x^2 + 4x + 7.

The characteristic equation in this case is

m^2 + 9 = 0

Since its roots are m = ±3i = 0 ± 3i, the complementary function is

A cos 3x + B sin 3x

According to Table 2.2, the necessary trial solutions corresponding to the respective terms
in the right member of the differential equation are

A_0 x^2 + A_1 x + A_2        A_0 x + A_1        A_0

However, the last two are clearly contained in the first, and no extra generality is achieved by
including them. Hence, we assume simply

Y = A_0 x^2 + A_1 x + A_2

Substituting this into the differential equation gives

2A_0 + 9(A_0 x^2 + A_1 x + A_2) = 2x^2 + 4x + 7

Equating coefficients of x^2, x, and the constant term x^0, we obtain the three equations

9A_0 = 2        9A_1 = 4        2A_0 + 9A_2 = 7

Hence,    A_0 = 2/9        A_1 = 4/9        A_2 = 59/81

and so a complete solution is

y = A cos 3x + B sin 3x + (18x^2 + 36x + 59)/81
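
Equating coefficients for a polynomial right member always leads to a triangular set of equations, which can be solved from the highest power downward. The following sketch is our own illustration (the function name is hypothetical) and assumes c ≠ 0, so that no term of the trial polynomial duplicates a term of the complementary function.

```python
from fractions import Fraction

def polynomial_particular_integral(a, b, c, rhs):
    """Return coefficients A[j] (ascending powers) of Y = sum A[j] x^j with
    a*Y'' + b*Y' + c*Y = sum rhs[j] x^j.

    The coefficient of x^j in the left member is
        a*(j+2)*(j+1)*A[j+2] + b*(j+1)*A[j+1] + c*A[j],
    so A[j] follows once A[j+1] and A[j+2] are known.
    """
    if c == 0:
        raise ValueError("c = 0: the trial polynomial must first be multiplied by x")
    n = len(rhs)
    A = [Fraction(0)] * (n + 2)      # two zero pads simplify the recurrence
    for j in range(n - 1, -1, -1):
        A[j] = (Fraction(rhs[j])
                - a * (j + 2) * (j + 1) * A[j + 2]
                - b * (j + 1) * A[j + 1]) / c
    return A[:n]

# y'' + 9y = 7 + 4x + 2x^2  (right member listed in ascending powers)
coeffs = polynomial_particular_integral(1, 0, 9, [7, 4, 2])
print(coeffs)
```

The returned list is in ascending powers, so its entries are the constant term, the coefficient of x, and the coefficient of x^2, in that order.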


EXAMPLE 2

Find a complete solution of the equation y" + 4y' + 5y = 3e^{-2x}.

In this case the characteristic equation is

m^2 + 4m + 5 = 0

and its roots are m = -2 ± i. Hence, the complementary function is

e^{-2x}(A cos x + B sin x)

For the trial solution Y normally corresponding to the term 3e^{-2x}, we have Y = Ce^{-2x}.
Moreover, although each term of the complementary function contains e^{-2x} as a factor, e^{-2x} does
not itself occur as a term in the complementary function; therefore, it is unnecessary and in fact
incorrect to modify Y by multiplying it by any power of x. Hence, we substitute Y = Ce^{-2x} into
the given equation, getting

4Ce^{-2x} - 8Ce^{-2x} + 5Ce^{-2x} = 3e^{-2x}    or    Ce^{-2x} = 3e^{-2x}

Thus, C = 3; the particular integral is Y = 3e^{-2x}; and a complete solution is

y = e^{-2x}(A cos x + B sin x) + 3e^{-2x}


EXAMPLE 3

Find a complete solution of the equation y" + 5y' + 6y = 3e^{-2x} + e^{3x}.
The roots of the characteristic equation

m^2 + 5m + 6 = 0

are m_1 = -2, m_2 = -3. Hence, the complementary function is

c_1 e^{-2x} + c_2 e^{-3x}




Find a complete solution of each of the following equations: 

9   y" + 2ay' + (a^2 - b^2)y = f(x)        10  y" + 2ay' + (a^2 + b^2)y = f(x)

11  y" + 2ay' + a^2 y = f(x)

12 Using the method of variation of parameters, find a particular integral of the equation
y" - y = 1/x. How does this result compare with the result of Exercise 20, Sec. 2.3?


2.5 

Equations of higher order 

The theory of the linear differential equation of order higher
than 2,

(1)    y^{(n)} + P_1(x)y^{(n-1)} + · · · + P_{n-1}(x)y' + P_n(x)y = R(x)

parallels the second-order case in all significant details. In
particular, with the obvious changes required by the fact that
n > 2, the three fundamental theorems of Sec. 2.1 hold for linear
equations of all orders.* For the especially important case of the
homogeneous, linear, constant-coefficient equation of order higher
than 2,

(2)    a_0 y^{(n)} + a_1 y^{(n-1)} + · · · + a_{n-1}y' + a_n y = 0

the substitution y = e^{mx} leads, as before, to the characteristic
equation

(3)    a_0 m^n + a_1 m^{n-1} + · · · + a_{n-1}m + a_n = 0

which can be obtained in a specific problem simply by replacing
each derivative by the corresponding power of m. The degree of
this algebraic equation will be the same as the order of the
differential equation (2); hence, counting repeated roots the
appropriate number of times, we find that the number of roots
m_1, m_2, . . . will equal the order of the differential equation.
From these roots, the solution of the homogeneous equation can
be constructed by adding together the terms that were listed in
Table 2.1, Sec. 2.2, as corresponding to each of the various root
types. The only extension necessary is required when the
characteristic equation (3) has roots of multiplicity greater than 2: If
y_1 is the solution normally corresponding to a root m_1, and if this
root occurs k (>2) times, then not only are y_1 and xy_1 solutions (as
in the second-order case), but x^2 y_1, x^3 y_1, . . . , x^{k-1} y_1 are also
solutions and must be included in the complementary function.


* Before Theorem 2, Sec. 2.1, can be extended to equations of higher order,
it is necessary that the Wronskian of more than two functions be defined.
The appropriate generalization is

                             | y_1           y_2           · · ·   y_n          |
W(y_1, y_2, . . . , y_n)  =  | y_1'          y_2'          · · ·   y_n'         |
                             | · · ·         · · ·         · · ·   · · ·        |
                             | y_1^{(n-1)}   y_2^{(n-1)}   · · ·   y_n^{(n-1)}  |

which clearly reduces to the definition of Sec. 2.1 if n = 2.

For the nonhomogeneous, constant-coefficient equation

(4)    a_0 y^{(n)} + a_1 y^{(n-1)} + · · · + a_{n-1}y' + a_n y = R(x)

it is still true that the complete solution is the sum of the
complementary function, obtained by solving the associated
homogeneous equation, and a particular integral. In the important case
when R(x) is a function possessing only a finite number of
independent derivatives the particular integral can be found just
as before by using the tentative choices for Y listed in Table 2.2, 
Sec. 2.3. Variation of parameters can be extended to those prob- 
lems which the method of undetermined coefficients cannot 
handle. An example or two will make these ideas clear. 


EXAMPLE 1

Find a complete solution of the equation y''' + 5y" + 9y' + 5y = 3e^{2x}.

The characteristic equation in this case is

m^3 + 5m^2 + 9m + 5 = 0

By inspection* m = -1 is seen to be a root. Hence, c_1 e^{-x} must be one term in the
complementary function. When the factor corresponding to this root is divided out of the characteristic
equation, there remains the quadratic equation

m^2 + 4m + 5 = 0

Its roots are m = -2 ± i; thus, the complementary function must also contain

e^{-2x}(c_2 cos x + c_3 sin x)

The entire complementary function is, therefore,

c_1 e^{-x} + e^{-2x}(c_2 cos x + c_3 sin x)

For a particular integral we try, as usual, Y = Ae^{2x}. Substituting this into the differential
equation gives

(8Ae^{2x}) + 5(4Ae^{2x}) + 9(2Ae^{2x}) + 5(Ae^{2x}) = 3e^{2x}    or    51Ae^{2x} = 3e^{2x}

Hence, A = 3/51 = 1/17 and, therefore,

y = c_1 e^{-x} + e^{-2x}(c_2 cos x + c_3 sin x) + e^{2x}/17
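
The step of dividing out the factor corresponding to a known root is ordinary synthetic division. The sketch below is an illustration with a helper name of our own (deflate); coefficients are listed in descending powers of m.

```python
import cmath

def deflate(coeffs, root):
    """Divide a polynomial (coefficients in descending powers) by (m - root)
    using synthetic division; return (quotient, remainder)."""
    out = [coeffs[0]]
    for c in coeffs[1:]:
        out.append(c + root * out[-1])
    return out[:-1], out[-1]

# m^3 + 5m^2 + 9m + 5 with the root m = -1 found by inspection
quotient, remainder = deflate([1, 5, 9, 5], -1)

# Roots of the remaining quadratic by the usual formula
a, b, c = quotient
r = cmath.sqrt(b * b - 4 * a * c)
roots = ((-b + r) / (2 * a), (-b - r) / (2 * a))
print(quotient, remainder, roots)
```

A zero remainder confirms that the trial value really is a root; the quotient then gives the coefficients of the quadratic m^2 + 4m + 5 used above.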


* In general, the most difficult feature of the solution of a linear, constant-
coefficient differential equation of order higher than 2 is the determination
of the roots of the characteristic equation. One useful procedure for doing
this, Graeffe's root-squaring process, is discussed in the Appendix.


EXAMPLE 2

Find a complete solution of the equation (D^4 + 8D^2 + 16)y = -sin x.

The characteristic equation here is

m^4 + 8m^2 + 16 = 0    or    (m^2 + 4)^2 = 0

The roots of this equation are m = ±2i, ±2i. Hence, the complementary function contains
not only the terms

cos 2x    and    sin 2x

but also these terms multiplied by x and is, therefore,

c_1 cos 2x + c_2 sin 2x + c_3 x cos 2x + c_4 x sin 2x

To find a particular integral we try Y = A cos x + B sin x, which, on substitution into
the differential equation, gives

(A cos x + B sin x) + 8(-A cos x - B sin x) + 16(A cos x + B sin x) = -sin x
or    9A cos x + 9B sin x = -sin x

This will be an identity if and only if A = 0† and B = -1/9. Therefore,

Y = -(sin x)/9

and the complete solution is

y = c_1 cos 2x + c_2 sin 2x + c_3 x cos 2x + c_4 x sin 2x - (sin x)/9

† Since the differential equation contains only derivatives of even order, we could have
foreseen that Y would contain only a sine term and that Y = B sin x would be a satisfactory
initial "guess."


EXAMPLE 3

Find the solution of the equation (D^4 + 3D^3 + 3D^2 + D)y = 2x + 8 for which y = y' = y" =
y''' = 0 when x = 0.

The characteristic equation in this case is

m^4 + 3m^3 + 3m^2 + m = 0    or    m(m + 1)^3 = 0

Its roots are m = 0, -1, -1, -1; hence, the complementary function, taking due account of
the triple root, is

a + be^{-x} + cxe^{-x} + dx^2 e^{-x}

To find a particular integral we would ordinarily assume

Y = Ax + B

However, one term in this expression (the constant B) duplicates a term already in the
complementary function (the constant a). Hence, we must multiply the original choice for Y by x
before using it.

Substituting the modified expression Y = Ax^2 + Bx into the differential equation, we find

0 + 3(0) + 3(2A) + (2Ax + B) = 2x + 8
or    2Ax + (6A + B) = 2x + 8

For this to be identically true requires that A = 1 and B = 2. Hence, Y = x^2 + 2x, and the
complete solution is

(5)    y = a + be^{-x} + cxe^{-x} + dx^2 e^{-x} + x^2 + 2x

In order to impose the given initial conditions, it is necessary that we have expressions for
y', y", and y''' as well as for y. Hence we differentiate, getting

(6)    y' = -be^{-x} + c(e^{-x} - xe^{-x}) + d(2xe^{-x} - x^2 e^{-x}) + 2x + 2

(7)    y" = be^{-x} + c(-2e^{-x} + xe^{-x}) + d(2e^{-x} - 4xe^{-x} + x^2 e^{-x}) + 2

(8)    y''' = -be^{-x} + c(3e^{-x} - xe^{-x}) + d(-6e^{-x} + 6xe^{-x} - x^2 e^{-x})

Substituting the given conditions into Eqs. (5), (6), (7), and (8), we find

0 = a + b
0 = -b + c + 2
0 = b - 2c + 2d + 2
0 = -b + 3c - 6d

Solving these simultaneously for a, b, c, and d gives

a = -12    b = 12    c = 10    d = 3

and, finally,

y = -12 + 12e^{-x} + 10xe^{-x} + 3x^2 e^{-x} + x^2 + 2x
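
The four simultaneous equations for a, b, c, and d can be solved by hand, but the elimination is easily mechanized. The following sketch, written with exact fractions and a helper name of our own, uses Gauss-Jordan elimination; it assumes the system has a unique solution (a singular system would make the pivot search fail).

```python
from fractions import Fraction

def solve_linear_system(matrix, rhs):
    """Solve matrix * unknowns = rhs exactly by Gauss-Jordan elimination.
    Assumes a square, nonsingular system."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(r)]
            for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column
        pivot = next(i for i in range(col, n) if rows[i][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for i in range(n):
            if i != col and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [v - factor * w for v, w in zip(rows[i], rows[col])]
    return [row[-1] for row in rows]

# Unknowns ordered (a, b, c, d); constant terms moved to the right-hand side.
matrix = [
    [1,  1,  0,  0],   # 0 = a + b
    [0, -1,  1,  0],   # 0 = -b + c + 2
    [0,  1, -2,  2],   # 0 = b - 2c + 2d + 2
    [0, -1,  3, -6],   # 0 = -b + 3c - 6d
]
rhs = [0, -2, -2, 0]
solution = solve_linear_system(matrix, rhs)
print(solution)
```

The entries of the result are the values of a, b, c, and d, in that order.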


EXERCISES 

Find a complete solution of each of the following equations:

1   (D^3 + 6D^2 + 11D + 6)y = 6x - 7            2   (D^4 - 16)y = e^x

3   y''' - 2y" - 3y' + 10y = 40 cos x           4   y^{iv} + 10y" + 9y = cos 2x

5   (D^4 + 8D^2 - 9)y = x^2 + sin 2x            6   (D^3 + D^2 + 3D - 5)y = e^x

7   (D^4 - D)y = x^2                            8   (D^3 - 64)y = 16 sin 2x


Find that solution of each of the following equations which satisfies the given conditions:

9   (D^3 + 2D^2 - D - 2)y = sin x               y = y' = y" = 0 when x = 0

10  (D^4 - 2D^3 + 2D^2 - 2D + 1)y = e^{-x}      y = y' = y" = y''' = 0 when x = 0

11  (D^3 - 2D^2 + D - 2)y = 0                   y = y' = y" = 1 when x = 0

12 For what values of λ, if any, does the equation y^{iv} - λ^4 y = 0 have a nontrivial solution
satisfying the conditions y = y' = 0 when x = 0 and y" = y''' = 0 when x = 1? (Hint:
The work is easier if a complete solution containing trigonometric and hyperbolic functions
is used instead of one containing trigonometric and exponential functions.)

13 Using the method of variation of parameters, obtain a formula for a particular integral
of the equation

(D^3 - 6D^2 + 11D - 6)y = f(x)

14 If three functions are linearly dependent, prove that their Wronskian is identically zero.

15 Prove that the Wronskian of the functions e^{m_1 x}, e^{m_2 x}, and e^{m_3 x} is different from zero if and
only if m_1, m_2, and m_3 are all different.


2.6

Applications

Linear differential equations with constant coefficients find their
most important application in the study of electrical circuits and 
vibrating mechanical systems. So useful to engineers are the 
results of this analysis that we shall devote an entire chapter to 
its major features. However, there are also other applications of 
considerable interest, and, although we cannot discuss them at 
length, we shall conclude this chapter with a few typical examples. 

One important field in which linear differential equations
often arise is the study of the bending of beams. When a beam is
bent it is obvious that the fibers near the concave surface of the
beam are compressed whereas those near the convex surface are
stretched. Somewhere between these regions of compression and
tension there must, from considerations of continuity, be a
surface of fibers which are neither compressed nor stretched. This is
known as the neutral surface of the beam, and the curve of any
particular fiber in this surface is known as the elastic curve or
deflection curve of the beam. The line in which the neutral surface
is cut by any plane cross section of the beam is known as the
neutral axis of that cross section (Fig. 2.1).

FIGURE 2.1 A beam before and after bending.

The loads which cause a beam to bend may be of two sorts:
They may be concentrated at one or more points along the beam,
or they may be continuously distributed with a density $w(x)$
known as the load per unit length. In either case we have two
important related quantities. One is the shear $V(x)$ at any point
along the beam, which is defined as the algebraic sum of all the
transverse forces which act on the beam on the positive side of the
point in question (Fig. 2.2). The other is the moment $M(x)$, which
is defined as the total moment produced at a general point along
the beam by all the forces, transverse or not, which act on the
beam on one side or the other of the point in question. We shall
consider the load per unit length and the shear to be positive if
they act in the direction of the negative y-axis. The moment we
shall take to be positive if it acts to bend the beam so that it is
concave toward the positive y-axis. With these conventions of
sign (which are not universally adopted) it is shown in the study
of strength of materials that the deflection of the beam $y(x)$
satisfies the second-order differential equation

(1) $$EIy'' = M$$

where E is the modulus of elasticity of the material of the beam,
and I, which may be a function of x, is the moment of inertia of
the cross-section area of the beam about the neutral axis. If the
beam bears only transverse loads, it can be shown further that
we have the two additional relations

(2) $$\frac{dM}{dx} = \frac{d(EIy'')}{dx} = V$$

(3) $$\frac{d^2M}{dx^2} = \frac{dV}{dx} = \frac{d^2(EIy'')}{dx^2} = -w$$

FIGURE 2.2 Plot showing the conventions for the signs of the moment, shear, and load per unit length at a general point of a beam.

In most elementary applications the moment M is an explicit 
function of x ; hence, Eq. (1) can be solved and the deflection y(x) 
determined simply by performing two integrations. However, in 
problems in which the load has a component in the direction of 
the length of the beam, M depends on y, and Eq. (1) can be 
solved only through the use of techniques from the field of differ- 
ential equations. An interesting example of this sort is provided 
by the classic problem of the buckling of a slender column. 

EXAMPLE 1 

A long, slender column of length L and uniform cross section whose ends are constrained to 
remain in the same vertical line but are otherwise free (i.e., are able to turn) is compressed by a 
load F. Determine the possible deflection curves of the column and the loads required to produce 
each one. 

Let coordinates be chosen as shown in Fig. 2.3. Then, clearly, the moment arm of the load F 
about a general point P on the deflection curve of the beam is y; hence, Eq. (1) becomes 
(4) $$EIy'' = -Fy$$

the minus sign indicating that, when y is positive (as shown), the moment is negative, since it 
has produced a deflection curve which is convex toward the positive y-axis. 

FIGURE 2.3 A slender column buckling under a vertical load.

By hypothesis, the column is of uniform cross section; hence, the moment of inertia I is a
constant. Therefore, (4) is a constant-coefficient differential equation and can be solved by the
methods of Sec. 2.2. Accordingly, we set up the characteristic equation

$$EIm^2 + F = 0$$

and solve it, getting

$$m = \pm i\sqrt{\frac{F}{EI}}$$

Hence, a complete solution of (4) is

(5) $$y = A\cos\sqrt{\frac{F}{EI}}\,x + B\sin\sqrt{\frac{F}{EI}}\,x$$

To determine the constants A and B, we have the information that y = 0 when x = 0 and
also when x = L. Substituting the first of these into Eq. (5), we see at once that A = 0.
Substituting the second, we obtain the equation

$$B\sin\sqrt{\frac{F}{EI}}\,L = 0$$

Since $\sin\sqrt{F/EI}\,L$ is in general not equal to zero, it follows that B = 0, which, since we have
already found A = 0, means that $y \equiv 0$. However, if the load F has just the right value to make
$\sqrt{F/EI}\,L = n\pi$, then the last equation will be satisfied without B being 0, and equilibrium is
then possible in a deflected position defined by

$$y = B\sin\frac{n\pi x}{L}$$

Since n can take on any of the values 1, 2, 3, ... , there are thus infinitely many different
critical loads

$$F_n = \left(\frac{n\pi}{L}\right)^2 EI$$

each with its own particular deflection curve. For values of F below the lowest critical load, the 
column will remain in its undeflected vertical position or, if displaced slightly from it, will return 
to it as an equilibrium configuration. For values of F above the lowest critical load and different 
from the higher critical loads, the column can theoretically remain in a vertical position, but 
the equilibrium is unstable, and, if the column is deflected slightly, it will not return to a vertical 
position but will continue to deflect until it collapses. Thus only the lowest critical load is of 
much practical significance.
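The critical loads and deflection curves of Example 1 can be tabulated with a short Python sketch. It assumes the closed forms $F_n = (n\pi/L)^2 EI$ and $y = B\sin(n\pi x/L)$ obtained above; the function names and the numerical values of E, I, and L are hypothetical and serve only as an illustration in consistent units.

```python
import math

def critical_loads(E, I, L, n_max=3):
    """First n_max critical loads F_n = (n*pi/L)**2 * E*I of the pin-ended column."""
    return [(n * math.pi / L) ** 2 * E * I for n in range(1, n_max + 1)]

def deflection(x, n, L, B=1.0):
    """Deflection curve y = B*sin(n*pi*x/L) belonging to the n-th critical load."""
    return B * math.sin(n * math.pi * x / L)

# Hypothetical data (not from the text), any consistent system of units.
E, I, L = 30.0e6, 0.05, 100.0
loads = critical_loads(E, I, L)

# Residual of EI*y'' + F*y at an interior point, with y'' from a central difference.
n, x0, h = 2, 37.0, 1.0e-3
ypp = (deflection(x0 + h, n, L) - 2 * deflection(x0, n, L) + deflection(x0 - h, n, L)) / h**2
residual = E * I * ypp + loads[n - 1] * deflection(x0, n, L)
```

Because the loads grow like $n^2$, the second critical load is four times the first, which is one way of seeing why only the lowest one matters in practice.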


In many physical systems vibratory motion is possible but
undesirable. In such cases it is important to know the frequency
at which vibration could take place in order that periodic external
influences that might be in resonance with the natural frequency
of the system can be avoided. For simple linear systems in which
(as is usually the case) friction is neglected, the underlying
differential equation is eventually reducible to the form

$$y'' + \omega^2 y = 0$$

Since the complete solution of this equation is

$$y = A\cos\omega t + B\sin\omega t$$

and since both $\cos\omega t$ and $\sin\omega t$ represent periodic behavior of
frequency

$$\omega \ \text{rad/unit time} \qquad\text{or}\qquad \frac{\omega}{2\pi} \ \text{cycles/unit time}$$

it is clear that the frequency can be read just as well from the
differential equation itself as from any of its solutions, general or
particular. The important part of such a frequency calculation,
then, is the formulation of the differential equation and not its
solution.


EXAMPLE 2 

A weight $W_2$ is suspended from a pulley of weight $W_1$, as shown in Fig. 2.4. Constraints, which
need not be specified, prevent any swinging of the system and permit it to move only in the 
vertical direction. If a spring of modulus k, that is, a spring requiring k units of force to stretch 
it one unit of length, is inserted in the otherwise inextensible cable which supports the pulley, 
find the frequency with which the system will vibrate in the vertical direction if it is displaced 
slightly from its equilibrium position. Friction between the cable and the pulley prevents any 
slippage, but all other frictional effects are to be neglected. 

As coordinate to describe the system we choose the vertical displacement y of the center
of the pulley, the downward direction being taken as positive. Now when the center of the
pulley moves a distance y, the length of the spring must change by 2y. Moreover, as this happens,
the pulley must rotate through an angle

$$\theta = \frac{y}{R} \qquad\text{and}\qquad \frac{d\theta}{dt} = \frac{1}{R}\frac{dy}{dt}$$

FIGURE 2.4 An unusual spring-suspended weight in equilibrium and after a vertical displacement.

It will be convenient to formulate the differential equation governing this problem through
the use of the so-called energy method. From the fundamental law of the conservation of energy,
it follows that if no energy is lost through friction or other irreversible changes, then in a mechanical
system the sum of the instantaneous potential and kinetic energies must remain constant. In the
present problem the potential energy consists of two parts: (a) the potential energy of the
weights $W_1$ and $W_2$ due to their position in the gravitational field and (b) the potential energy
stored in the stretched spring. Taking the equilibrium position of the system as the reference
level for potential energy, we have for (a)

(6) $$(PE)_a = -(W_1 + W_2)y$$

the minus sign indicating that a positive y corresponds to a lowering of the weights and, hence,
a decrease in the potential energy. The potential energy stored in the spring is simply the amount
of work required to stretch the spring from its equilibrium elongation, say $\delta$, to its instantaneous
elongation $\delta + 2y$. Since the force in the spring at any time is

$$F = \text{elongation} \times \text{force per unit elongation} = sk$$

we have for the potential energy of type b

(7) $$(PE)_b = \int_\delta^{\delta+2y} F\,ds = \int_\delta^{\delta+2y} ks\,ds = \frac{ks^2}{2}\Big|_\delta^{\delta+2y} = 2ky^2 + 2k\delta y$$

The kinetic energy likewise consists of two parts: (a) the energy of translation of the
weights $W_1$ and $W_2$, namely,

(8) $$(KE)_a = \frac{1}{2}\,\frac{W_1 + W_2}{g}\,(\dot y)^2$$ †

and (b) the energy of rotation of the pulley, namely,

(9) $$(KE)_b = \frac{1}{2}I(\dot\theta)^2 = \frac{1}{2}\,\frac{W_1R^2}{2g}\left(\frac{\dot y}{R}\right)^2 = \frac{W_1}{4g}(\dot y)^2$$
The conservation of energy now requires that

Kinetic energy + potential energy = constant

or, substituting from Eqs. (6), (7), (8), and (9),

$$\frac{W_1 + W_2}{2g}(\dot y)^2 + \frac{W_1}{4g}(\dot y)^2 + (2ky^2 + 2k\delta y) - (W_1 + W_2)y = C$$

Differentiating this with respect to time, we have

$$\frac{W_1 + W_2}{g}\,\dot y\ddot y + \frac{W_1}{2g}\,\dot y\ddot y + 4ky\dot y + 2k\delta\dot y - (W_1 + W_2)\dot y = 0$$

or, dividing out $\dot y$ (which surely cannot be identically zero when the system is in motion) and
collecting terms,

$$\frac{3W_1 + 2W_2}{2g}\,\ddot y + 4ky = (W_1 + W_2) - 2k\delta = 0$$

the terms on the right equaling zero since the elongation $\delta$ of the spring in its equilibrium
position is

$$\delta = \frac{W_1 + W_2}{2k}$$

The differential equation describing the vertical movement of the system is, therefore,

$$\ddot y + \frac{8kg}{3W_1 + 2W_2}\,y = 0$$

From this, as we pointed out above, we can immediately read the natural frequency of the
system, namely,

$$\frac{1}{2\pi}\sqrt{\frac{8kg}{3W_1 + 2W_2}} \ \text{cycles/unit time}$$
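As a numerical companion to Example 2, the following Python sketch evaluates the natural frequency and checks that the total energy stays constant along the motion $y = A\cos\omega t$. The function names, the value $g = 32.2$ ft/sec$^2$, and the sample data are assumptions made for illustration only; the energy expressions are those of Eqs. (6) to (9), with the pulley inertia taken as $W_1R^2/2g$.

```python
import math

G = 32.2  # assumed gravitational acceleration, ft/sec^2; use the value matching your units

def omega_squared(k, W1, W2, g=G):
    """Coefficient of y in y'' + [8kg/(3*W1 + 2*W2)] y = 0."""
    return 8.0 * k * g / (3.0 * W1 + 2.0 * W2)

def natural_frequency(k, W1, W2, g=G):
    """Natural frequency in cycles per unit time."""
    return math.sqrt(omega_squared(k, W1, W2, g)) / (2.0 * math.pi)

def total_energy(y, ydot, k, W1, W2, g=G):
    """Kinetic plus potential energy, Eqs. (6)-(9), measured from equilibrium."""
    delta = (W1 + W2) / (2.0 * k)  # equilibrium elongation of the spring
    kinetic = (W1 + W2) / (2.0 * g) * ydot**2 + W1 / (4.0 * g) * ydot**2
    potential = 2.0 * k * y**2 + 2.0 * k * delta * y - (W1 + W2) * y
    return kinetic + potential

# Hypothetical data: spring modulus, pulley weight, hanging weight, amplitude.
k, W1, W2, A = 50.0, 8.0, 20.0, 0.1
omega = math.sqrt(omega_squared(k, W1, W2))
energies = [
    total_energy(A * math.cos(omega * t), -A * omega * math.sin(omega * t), k, W1, W2)
    for t in (0.0, 0.1, 0.25, 0.6)
]
```

Along this motion every entry of `energies` should equal $2kA^2$, the energy stored at maximum displacement, which is a quick consistency check on the derivation.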


† In problems in dynamics, first and second derivatives with respect to time
are often indicated by placing one and two dots, respectively, over the
variable in question.

In general, differential equations with variable coefficients
are very difficult to solve and rarely can be solved in terms of
elementary functions. However, there is one important linear
differential equation with variable coefficients which can always
be reduced by a suitable substitution to a linear equation with
constant coefficients and hence solved without difficulty. This is
the so-called equation of Euler*

(10) $$a_0x^ny^{(n)} + a_1x^{n-1}y^{(n-1)} + \cdots + a_{n-1}xy' + a_ny = 0$$

in which the coefficient of each derivative is proportional to the
corresponding power of the independent variable. If we change
the independent variable from x to z by means of the substitution

$$x = e^z \qquad\text{or}\qquad z = \ln x$$

Eq. (10) becomes an equation in y and z with constant coefficients
which can then be solved by the methods of Sec. 2.5. Finally,
replacing z by $\ln x$ in the solution of the transformed equation we
obtain the solution of the original differential equation.


EXAMPLE 3 

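Find a complete solution of the differential equation

$$x^3\frac{d^3y}{dx^3} + 4x^2\frac{d^2y}{dx^2} - 5x\frac{dy}{dx} - 15y = 0$$

Under the transformation $x = e^z$ or $z = \ln x$ we have

$$\frac{dy}{dx} = \frac{dy}{dz}\frac{dz}{dx} = \frac{1}{x}\frac{dy}{dz}$$

$$\frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{1}{x}\frac{dy}{dz}\right) = -\frac{1}{x^2}\frac{dy}{dz} + \frac{1}{x}\frac{d^2y}{dz^2}\frac{dz}{dx} = \frac{1}{x^2}\left(-\frac{dy}{dz} + \frac{d^2y}{dz^2}\right)$$

$$\frac{d^3y}{dx^3} = \frac{d}{dx}\left[\frac{1}{x^2}\left(-\frac{dy}{dz} + \frac{d^2y}{dz^2}\right)\right] = -\frac{2}{x^3}\left(-\frac{dy}{dz} + \frac{d^2y}{dz^2}\right) + \frac{1}{x^2}\left(-\frac{d^2y}{dz^2} + \frac{d^3y}{dz^3}\right)\frac{dz}{dx} = \frac{1}{x^3}\left(2\frac{dy}{dz} - 3\frac{d^2y}{dz^2} + \frac{d^3y}{dz^3}\right)$$

Substituting these into the given differential equation, we have

$$x^3\left[\frac{1}{x^3}\left(\frac{d^3y}{dz^3} - 3\frac{d^2y}{dz^2} + 2\frac{dy}{dz}\right)\right] + 4x^2\left[\frac{1}{x^2}\left(-\frac{dy}{dz} + \frac{d^2y}{dz^2}\right)\right] - 5x\left(\frac{1}{x}\frac{dy}{dz}\right) - 15y = 0$$

or, simplifying and collecting terms,

$$\frac{d^3y}{dz^3} + \frac{d^2y}{dz^2} - 7\frac{dy}{dz} - 15y = 0$$

The characteristic equation of the last equation is

$$m^3 + m^2 - 7m - 15 = (m - 3)(m^2 + 4m + 5) = 0$$

From its roots, $m_1 = 3$, $m_2 = -2 + i$, $m_3 = -2 - i$, we obtain the complete solution

$$y = c_1e^{3z} + e^{-2z}(c_2\cos z + c_3\sin z)$$

Finally, replacing z by $\ln x$, we have

$$y = c_1e^{3\ln x} + e^{-2\ln x}[c_2\cos(\ln x) + c_3\sin(\ln x)]$$

$$\phantom{y} = c_1x^3 + \frac{1}{x^2}[c_2\cos(\ln x) + c_3\sin(\ln x)]$$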
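The reduction used in Example 3 can be mechanized. Under $x = e^z$ the term $x^k y^{(k)}$ becomes $D(D-1)\cdots(D-k+1)y$ with $D = d/dz$, so the characteristic polynomial of the transformed equation is a sum of falling factorials in m. The Python sketch below builds that polynomial from the coefficients of an Euler equation; the helper names are hypothetical, and polynomials are stored as lists of ascending coefficients.

```python
def poly_mul(p, q):
    """Product of two polynomials given as ascending coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Sum of two polynomials given as ascending coefficient lists."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def falling_factorial(k):
    """Ascending coefficients of m(m-1)...(m-k+1); equals 1 when k = 0."""
    poly = [1]
    for j in range(k):
        poly = poly_mul(poly, [-j, 1])
    return poly

def euler_characteristic(coeffs):
    """coeffs = [a0, a1, ..., an] of a0 x^n y^(n) + ... + a_{n-1} x y' + an y = 0."""
    n = len(coeffs) - 1
    total = [0]
    for i, a in enumerate(coeffs):
        total = poly_add(total, [a * c for c in falling_factorial(n - i)])
    return total

def poly_eval(p, m):
    """Value of the polynomial p at m (m may be complex)."""
    return sum(c * m**k for k, c in enumerate(p))

# Example 3: x^3 y''' + 4x^2 y'' - 5x y' - 15y = 0
char = euler_characteristic([1, 4, -5, -15])
```

For Example 3 the list `char` holds the coefficients of $m^3 + m^2 - 7m - 15$ in ascending order, and `poly_eval` vanishes at the roots 3 and $-2 \pm i$ up to rounding.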


* Also called Cauchy’s equation, after the French mathematician Augustin 
Louis Cauchy (1789-1857). 




EXERCISES 

Find a complete solution of each of the following equations: 

1 $x^3y''' + 2x^2y'' - xy' + y = 0$

2 $x^3y''' - 3x^2y'' + 7xy' - 8y = 0$

8 & .fl! +5s 4 + ! ,_ &+2 

dx a dx 

4 Since $e^z$ is always positive, does the use of the substitution $x = e^z$ mean that an Euler
equation can be solved only for positive values of x? How can a solution be obtained which will
be valid for negative values of x?

5 If $x = e^z$ and if $\dfrac{d}{dz} \equiv D$, establish the operational equivalences:

$$x\frac{d}{dx} = D \qquad x^2\frac{d^2}{dx^2} = D(D - 1) \qquad x^3\frac{d^3}{dx^3} = D(D - 1)(D - 2)$$

Explain how these formulas can be used to shorten the work of solving an Euler equation.

6 a Show that the substitution $Ax + B = e^z$, or $z = \ln(Ax + B)$, will reduce the equation

$$a(Ax + B)^2\frac{d^2y}{dx^2} + b(Ax + B)\frac{dy}{dx} + cy = 0$$

to a linear equation with constant coefficients. Do you think that, for n > 2, the equation

$$a_0(Ax + B)^n\frac{d^ny}{dx^n} + a_1(Ax + B)^{n-1}\frac{d^{n-1}y}{dx^{n-1}} + \cdots + a_{n-1}(Ax + B)\frac{dy}{dx} + a_ny = 0$$

can be solved in a similar fashion?

b Find a complete solution of the equation

$$(x - 2)^2\frac{d^2y}{dx^2} + 2(x - 2)\frac{dy}{dx} - 6y = 0$$

7 A circular cylinder of radius r and height h, made of material weighing w lb/in.$^3$, floats in
water in such a way that its axis is always vertical. Neglecting all forces except gravity and 
the buoyant force of the water, as given by the principle of Archimedes, determine the 
period with which the cylinder will vibrate in the vertical direction if it is depressed slightly 
from its equilibrium position and released. 

8 A cylinder weighing 50 lb floats in water with its axis vertical. When depressed slightly 
and released, it vibrates with period 2 sec. Neglecting all frictional effects, find the diameter 
of the cylinder. 

9 A straight hollow tube rotates about its mid-point with constant angular velocity $\omega$, the
rotation taking place in a horizontal plane. A pellet of mass m slides without friction in the 
interior of the tube. Find the equation of the radial motion of the pellet until it emerges 
from the tube, assuming that it starts from rest at a radial distance a from the mid-point 
of the tube. 

10 A straight hollow tube rotates about its mid-point with constant angular velocity $\omega$, the
rotation taking place in a vertical plane. Show that if the initial conditions are properly 
chosen, a pellet sliding without friction in the tube will never be ejected but will execute 
simple harmonic motion within the tube. 


11 Neglecting the effect of its own weight, show that the deflection of a uniform cantilever
beam at the point $x = x_0$ due to a unit load at the point $x = x_1$ is equal to the deflection
at $x = x_1$ due to a unit load at $x = x_0$.

12 A uniform cantilever beam of length L is subjected to an oblique tensile force F at the free
end. Find the tip deflection as a function of the angle $\theta$ between the direction of the force
and the initial direction of the beam.

13 A long, slender column of uniform cross section is built in rigidly at its base. Its upper end, 
which is free to move out of line, bears a vertical load F. Determine the possible deflection 
curves and the load required to produce each one. 

14 A uniform shaft of length L rotates about its axis with constant angular velocity $\omega$. The
ends of the shaft are held in bearings which are free to swing out of line, as shown in Fig. 2.5,
if the shaft deflects from its neutral position. Show that there are infinitely many critical
speeds at which the shaft can rotate in a deflected position, and find these speeds and the
associated deflection curves. [Hint: During rotation, centrifugal force applies a load per
unit length given by

$$w(x) = -\frac{\rho A\omega^2}{g}\,y$$

where A is the cross-section area of the shaft and $\rho$ is the density of the material of the
shaft. Substitute this into Eq. (3), solve the resulting differential equation, and then impose
the conditions that at x = 0 and at x = L the deflection of the shaft and the moment are
zero.]

FIGURE 2.5

15 Work Exercise 14 if the bearings are fixed in position and cannot swing out of line.

16 A cantilever beam has the shape of a solid of revolution whose radius varies as $\sqrt{x}$, where x
is the distance from the free end of the beam. A tensile force F is applied at the free end of 
the beam at an angle of 45° with the initial direction of the beam. Find the deflection curve 
of the beam. 

17 A weight W hangs by an inextensible cord from the circumference of a pulley of radius R
and moment of inertia I. The pulley is prevented from rotating freely by a spring of modu- 
lus k, attached as shown in Fig. 2.6. Considering only displacements so small that the 
departure of the spring from the horizontal can be neglected, and neglecting all friction, 
determine the natural frequency of the oscillations that occur when the system is slightly 
disturbed. (Hint: Use the energy method to obtain the differential equation of the system.) 





18 Under the assumption of very small motions and neglecting friction, determine the natural 
frequency of the system shown in Fig. 2.7 if the bar is of uniform cross section, absolutely 
rigid, and of weight w. 


19 Under the assumption of very small motions and neglecting friction, determine the natural 
frequency of the system shown in Fig. 2.8 if the bar is of uniform cross section, absolutely 
rigid, and of weight w. 





20 A perfectly flexible cable of length 2L, weighing w lb/ft, hangs over a frictionless peg of
negligible diameter. At t = 0 the cable is released from rest in a position in which the portion
hanging on one side is a ft longer than that on the other. Find the equation of motion of 
the cable as it slips over the peg. 

21 A perfectly flexible cable of length L and weighing w lb/ft lies in a straight line on a
frictionless table top, a ft of the cable hanging over the edge. At t = 0 the cable is released and
begins to slide off the edge of the table. Assuming that the height of the table is greater 
than L, determine the motion of the cable until it leaves the table top. 

22 A perfectly flexible cable of length L, weighing w lb/ft, hangs over a pulley as shown in 
Fig. 2.9. The radius of the pulley is R, and its moment of inertia is I. Friction between the
cable and the pulley prevents any relative slipping, although the pulley is free to turn
without appreciable friction. At t = 0 the cable is released from rest in a position in which
the portion hanging on one side is a ft longer than that hanging on the other. Determine 
the motion of the cable until the short end first makes contact with the pulley. 

23 Neglecting friction and assuming angular displacements $\theta$ so small that $\theta$ is a satisfactory
approximation to $\sin\theta$ and $\theta^2/2$ is a satisfactory approximation to $1 - \cos\theta$, find the
natural frequency of the system shown in Fig. 2.10, if the bar is of uniform cross section,
absolutely rigid, and of weight w.

24 Neglecting friction and assuming angular displacements $\theta$ so small that $\theta$ is a satisfactory
approximation to $\sin\theta$ and $\theta^2/2$ is a satisfactory approximation to $1 - \cos\theta$, find the
natural frequency of the system shown in Fig. 2.11, if the bar is of uniform cross section,
absolutely rigid, and of weight w. In what significant respect does this system differ from
that discussed in Exercise 23?

25 a Two disks each of moment of inertia I are connected by an elastic shaft of modulus k,
that is, a shaft which requires k units of torque to twist one end through an angle of one
radian with respect to the other end. The system is mounted in a frictionless bearing, as
shown in Fig. 2.12. Neglecting the moment of inertia of the shaft, find the natural frequency
with which the disks will oscillate if they are twisted through equal but opposite angles
and then released.

b What is the natural frequency of the system if the moments of inertia of the disks are
respectively $I_1$ and $I_2$? (Hint: Not only does the total energy of the system remain
constant, but so does the total angular momentum.)



CHAPTER THREE 


Simultaneous 
Linear Differential 
Equations 


3.1 Introduction

In many problems in applied mathematics there are not one but
several dependent variables, each a function of a single independ- 
ent variable, usually time. The formulation of such a problem in 
mathematical terms frequently leads to a system of simultaneous 
linear differential equations, as many equations as there are 
dependent variables. 

There are various methods of solving such systems. In one, 
which bears a strong resemblance to the solution of systems of 
simultaneous algebraic equations, the system is reduced by successive
elimination of the unknowns until a single differential
equation remains. This is solved, and then, working backward, 
the solutions for the other variables are found, one by one, until 
the problem is completed. A second method, which amounts to 
considering the system as a single matric differential equation, 
generalizes the ideas of complementary function and particular 
integral and, through their use, obtains solutions for all the 
variables at the same time. Finally, the use of the Laplace trans- 
formation provides a straightforward operational procedure for 
solving systems of linear differential equations with constant 
coefficients, which is probably preferable in most applications to 
either of the other methods. 

In this chapter we shall attempt through examples to present 
the first two methods, leaving the third to Chap. 7, where we shall
discuss the Laplace transformation and its applications in detail. 


3.2 The reduction of a system to a single equation

Consider the following system of equations: 

+ * + »$+ 
s+ & + 1+^-' 

If we subtract twice the second equation from the first, we obtain 


If we subtract the second equation from 5 times the first, we 
obtain 


Finally, if we differentiate Eq. (1) and add it to Eq. (2), all
occurrences of x will be eliminated and we shall have an equation
in y alone:

$$\frac{d^2y}{dt^2} + \frac{dy}{dt} - 2y = 4e^{-t} - t - 2$$

It is now a simple matter to solve this equation by the methods of
Chap. 2, and we find without difficulty

(3) $$y = c_1e^{t} + c_2e^{-2t} + \frac{t}{2} + \frac{5}{4} - 2e^{-t}$$
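A quick way to check a solution of this kind is to substitute it back into the single equation in y. The Python sketch below does so for $y = c_1e^{t} + c_2e^{-2t} + t/2 + 5/4 - 2e^{-t}$ and the equation $y'' + y' - 2y = 4e^{-t} - t - 2$, with the derivatives written out by hand; the function names and the sample constants are arbitrary choices for illustration.

```python
import math

def y(t, c1, c2):
    return c1 * math.exp(t) + c2 * math.exp(-2 * t) + t / 2 + 5 / 4 - 2 * math.exp(-t)

def dy(t, c1, c2):
    return c1 * math.exp(t) - 2 * c2 * math.exp(-2 * t) + 1 / 2 + 2 * math.exp(-t)

def d2y(t, c1, c2):
    return c1 * math.exp(t) + 4 * c2 * math.exp(-2 * t) - 2 * math.exp(-t)

def residual(t, c1, c2):
    """Left member minus right member of y'' + y' - 2y = 4e^(-t) - t - 2."""
    left = d2y(t, c1, c2) + dy(t, c1, c2) - 2 * y(t, c1, c2)
    right = 4 * math.exp(-t) - t - 2
    return left - right

# Arbitrary constants: the residual should vanish for every choice of c1 and c2.
residuals = [residual(t, 1.3, -0.4) for t in (-1.0, 0.0, 0.5, 2.0)]
```

The residual is zero (up to rounding) whatever values are given to the two constants, as it must be for a complete solution of a second-order equation.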


Various possibilities are available for finding x. By far the 
simplest is to use Eq. (1), which gives x directly in terms of y and 
its derivative. Thus, 

x “l(jt- 13y + 2t -‘ r ) 

= l - 2c2G~~ l + i + 2e-^ 

- 13 (cie 1 + c 2 e~ 2t + | + ~ - 2e~^ + 2 1- tr* j 


Equations (3) and (4) constitute a complete solution of the 
original system. 

In general, the steps in the reduction of a system of equations
to a single equation are not so obvious as they were in the example
we have just worked. For this reason it is frequently convenient
to rewrite the given equations in the D notation. Then, if we
regard the operational coefficients of the variables as ordinary
algebraic coefficients, the method of elimination will usually be
apparent. Still more systematically, determinants can be used to
obtain the single equation satisfied by any one of the unknowns,
very much as in the case of linear algebraic equations.

Suppose, for definiteness, that we have the second-order
system

$$(a_{11}D^2 + b_{11}D + c_{11})x + (a_{12}D^2 + b_{12}D + c_{12})y = \phi_1(t)$$
$$(a_{21}D^2 + b_{21}D + c_{21})x + (a_{22}D^2 + b_{22}D + c_{22})y = \phi_2(t)$$

or, more compactly,

$$P_{11}(D)x + P_{12}(D)y = \phi_1(t)$$
$$P_{21}(D)x + P_{22}(D)y = \phi_2(t)$$

where the P's denote the polynomial operators which act on x
and y. If these were, as indeed they appear to be, two algebraic
equations in x and y, we could eliminate y at once by subtracting
$P_{12}(D)$ times the second equation from $P_{22}(D)$ times the first
equation, getting

(5) $$[P_{11}(D)P_{22}(D) - P_{12}(D)P_{21}(D)]x = P_{22}(D)\phi_1(t) - P_{12}(D)\phi_2(t)$$

Moreover, this procedure is clearly valid even though the system
consists of differential equations rather than algebraic equations.
For "multiplying" the first equation by

$$P_{22}(D) \equiv a_{22}D^2 + b_{22}D + c_{22}$$

is simply a way of performing in one step the operations of adding
$a_{22}$ times the second derivative of the equation and $b_{22}$ times the
first derivative of the equation to $c_{22}$ times the equation itself, and
these steps are individually well defined and completely correct.
Similarly, "multiplying" the second equation by

$$P_{12}(D) \equiv a_{12}D^2 + b_{12}D + c_{12}$$

merely furnishes in one step the sum of $a_{12}$ times the second
derivative of the equation, $b_{12}$ times the first derivative of the
equation, and $c_{12}$ times the equation itself. Finally, the subtraction
of the two equations obtained by the "multiplications" we
have just described eliminates y and each of its derivatives
because these operations produce in each equation exactly the
same combination of y and its various derivatives. Similarly, of
course, x can be eliminated from the system by subtracting $P_{21}(D)$
times the first equation from $P_{11}(D)$ times the second, leaving a
differential equation from which y can be found at once.

The preceding observations can easily be formulated in
determinant notation. In fact, the (operational) coefficient of x in
Eq. (5) is simply the determinant of the (operational) coefficients
of the unknowns in the original system, namely,

$$\begin{vmatrix} P_{11}(D) & P_{12}(D) \\ P_{21}(D) & P_{22}(D) \end{vmatrix}$$

Furthermore, the right-hand side of (5) can be identified as the
expanded form of the determinant

$$\begin{vmatrix} \phi_1(t) & P_{12}(D) \\ \phi_2(t) & P_{22}(D) \end{vmatrix}$$

provided we keep in mind that the operators $P_{12}(D)$ and $P_{22}(D)$
must operate on $\phi_2(t)$ and $\phi_1(t)$, respectively, and hence the
diagonal products must be interpreted to mean

$$P_{22}(D)\phi_1(t) \quad\text{and}\quad P_{12}(D)\phi_2(t)$$

and not

$$\phi_1(t)P_{22}(D) \quad\text{and}\quad \phi_2(t)P_{12}(D)$$

Thus, Eq. (5) can be written in the form

(6) $$\begin{vmatrix} P_{11}(D) & P_{12}(D) \\ P_{21}(D) & P_{22}(D) \end{vmatrix} x = \begin{vmatrix} \phi_1(t) & P_{12}(D) \\ \phi_2(t) & P_{22}(D) \end{vmatrix}$$

which is precisely what Cramer's rule (Theorem 7, Sec. 10.5)
would yield if applied to the given system as though it were
purely algebraic. In just the same way, the result of eliminating x
from the original system, namely,

$$[P_{11}(D)P_{22}(D) - P_{12}(D)P_{21}(D)]y = P_{11}(D)\phi_2(t) - P_{21}(D)\phi_1(t)$$

can be written

(7) $$\begin{vmatrix} P_{11}(D) & P_{12}(D) \\ P_{21}(D) & P_{22}(D) \end{vmatrix} y = \begin{vmatrix} P_{11}(D) & \phi_1(t) \\ P_{21}(D) & \phi_2(t) \end{vmatrix}$$

The use of Cramer’s rule to obtain the differential equations 
satisfied by the individual dependent variables is in no way 
restricted to the case of two equations in two unknowns. Exactly 
the same procedure can be applied to systems of any number of 
equations, regardless of the degrees of the polynomial operators 
which appear as the coefficients of the unknowns. Moreover, as 
Eqs. (6) and (7) illustrate, the polynomial operators appearing in 
the left members of the equations which result when the original 
system is "solved" for the various unknowns are identical. Hence
the characteristic equations of these differential equations are 
identical, and, therefore, except for the presence of different 
arbitrary constants, the complementary functions in the solutions 
for the various unknowns are all the same. The constants in these 
complementary functions are not all independent, however, and 
relations will always exist among them serving to reduce their 
number to the figure required by the following theorem :* 


* For a proof of this result see, for instance, E. L. Ince, “Ordinary Differen- 
tial Equations,” pp. 144-150, Dover Publications, Inc., New York, 1944. 




THEOREM 1 

If the determinant of the operational coefficients of a system of n linear differ- 
ential equations with constant coefficients is not identically zero, then the number 
of arbitrary constants in any complete solution of the system is equal to the 
degree of the determinant of the operational coefficients, regarded as a poly- 
nomial in D. In particular cases in which the determinant of the operational 
coefficients is identically zero, the system may have no solution or it may have 
solutions containing any number of arbitrary constants. 

The necessary relations between the constants appearing initially 
in the solutions for the unknowns can always be found by substi- 
tuting these solutions into all but one of the n equations in the 
original system (though not necessarily each set of n — 1 equa- 
tions) and then equating to zero the net coefficients of the terms 
that occur in each of these equations. 
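The count of arbitrary constants promised by Theorem 1 can be read off mechanically for a pair of equations. The Python sketch below multiplies out the operational determinant $P_{11}P_{22} - P_{12}P_{21}$, treating each operator as a polynomial in D stored as a list of ascending coefficients, and reports its degree. The function names and the two sample systems are hypothetical illustrations, not taken from the text.

```python
def poly_mul(p, q):
    """Product of two polynomials in D (ascending coefficient lists)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_sub(p, q):
    """Difference p - q of two polynomials in D."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) - (q[i] if i < len(q) else 0) for i in range(n)]

def operator_determinant(P11, P12, P21, P22):
    """Expanded determinant P11*P22 - P12*P21 of a two-by-two operational system."""
    return poly_sub(poly_mul(P11, P22), poly_mul(P12, P21))

def degree(p):
    """Degree of p, or None if p is identically zero (the exceptional case of Theorem 1)."""
    nonzero = [k for k, c in enumerate(p) if c != 0]
    return max(nonzero) if nonzero else None

# Hypothetical system A: (D^2 + 1)x + y = ..., x + (D + 2)y = ...
det_a = operator_determinant([1, 0, 1], [1], [1], [2, 1])

# Hypothetical system B: (D + 1)x + Dy = ..., Dx + (D - 1)y = ...
det_b = operator_determinant([1, 1], [0, 1], [0, 1], [-1, 1])
```

System A has determinant $D^3 + 2D^2 + D + 1$, so its complete solution contains three arbitrary constants. In system B the leading terms cancel and the determinant reduces to the constant $-1$, so the solution contains no arbitrary constants at all, even though both equations are of the first order.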


EXAMPLE 1 

Find a complete solution of the system 

(8) $$(3D^2 + 3D + 2)x + (D^2 + 2D + 3)y = e^t$$
$$(2D^2 - D - 2)x + (D^2 + D + 1)y = 8$$

From the preceding discussion we know that the equation satisfied by x is

$$\begin{vmatrix} 3D^2 + 3D + 2 & D^2 + 2D + 3 \\ 2D^2 - D - 2 & D^2 + D + 1 \end{vmatrix} x = \begin{vmatrix} e^t & D^2 + 2D + 3 \\ 8 & D^2 + D + 1 \end{vmatrix}$$

or, expanding the determinants and operating, as required, on the known functions $e^t$ and 8,*

$$(D^4 + 3D^3 + 6D^2 + 12D + 8)x = 3e^t - 24$$

The roots of the characteristic equation of this differential equation are $-1$, $-2$, $\pm 2i$. Hence
the complementary function is

$$c_1e^{-t} + c_2e^{-2t} + c_3\cos 2t + c_4\sin 2t$$

It is easy to see that

$$\frac{e^t}{10} - 3$$

is a particular integral, and therefore

(9) $$x = c_1e^{-t} + c_2e^{-2t} + c_3\cos 2t + c_4\sin 2t + \frac{e^t}{10} - 3$$

* In carrying out these expansions it must be borne in mind that the
operational elements in the determinant on the right operate on the algebraic
elements $e^t$ and 8, whereas the elements in the determinant on the left all
operate on x and not on each other. This is the reason why in expanding
the determinant on the right we have reductions such as

$$D^2\,8 = 0 \quad\text{and}\quad 2D\,8 = 0$$

whereas in expanding the determinant on the left we have only formal
multiplications such as

$$2D^2 \cdot 3 = 6D^2 \quad\text{and}\quad D^2(-2) = -2D^2$$



The solution for y can now be found by substituting the last expression into either of the 
original equations and solving the resulting differential equation for y. However, it is usually a 
little easier to use Cramer’s rule again. Doing this, we find that y must satisfy the equation 


3D 2 + 3D + 2 

D 2 + 2D + 3 

3D 2 + 3D + 2 e‘ 

2D 2 - D - 2 

D 2 + D + 1 V ~ 

2D 2 - D - 2 8 


or ( D i + 3D 3 + 6D 2 + 12 D + 8 )y = + 16 

The solution of this presents no difficulty, and we find at once that- 

(10) y ~ kie~* + * 2 e~ 21 + * 3 cos 2t + * 4 sin 2t + ~ 4- 2 

However, Eqs. (9) and (10) do not yet constitute the solution of the given system, since 
collectively they contain eight arbitrary constants, whereas, according to Theorem 1, the com- 
plete solution of (8) can contain only four constants. To accomplish the necessary reduction in 
the number of constants we must now substitute from (9) and (10) into either one or the other 
(i.e., into all but one) of the original equations, say the second: 

(2D 5 - D - 2) ^ Cl e-‘ + cse- 2 * + c 3 cos 2t + c« sin 2t+-~-zj 

+ (D 2 + D + 1) ^*i<r' -f * 2 <r 2< + k 3 cos 2 1 + * 4 sin 2t + ^ 4- 2^ = 8 

or, performing the indicated differentiations and collecting terms, 

(ci + A,)e-‘ + (8c 2 + 3 k 3 )e~» -f (~10c 3 - 2c 4 - 3h + 2* 4 ) cos 2f 

-f (2c s - 10c 4 - 2*a - 3* 4 ) sin 2 1 = 0 

As it stands, with all eight constants completely arbitrary, this equation is not identically 
satisfied.* It will be an identity if and only if 

$c_1 + k_1 = 0$
$8c_2 + 3k_2 = 0$
$-10c_3 - 2c_4 - 3k_3 + 2k_4 = 0$
$2c_3 - 10c_4 - 2k_3 - 3k_4 = 0$

From these we find (among many equivalent possibilities)

$k_1 = -c_1 \qquad k_2 = -\tfrac{8}{3}c_2 \qquad k_3 = -2(c_3 + c_4) \qquad k_4 = 2(c_3 - c_4)$

Hence, the complete solution to our problem is the pair of functions

$x = c_1 e^{-t} + c_2 e^{-2t} + c_3 \cos 2t + c_4 \sin 2t + \frac{e^t}{10} - 3$

$y = -c_1 e^{-t} - \tfrac{8}{3}c_2 e^{-2t} - 2(c_3 + c_4)\cos 2t + 2(c_3 - c_4)\sin 2t + \frac{e^t}{30} + 2$

Though tedious, it is perfectly straightforward to verify that these expressions satisfy the first 
of the original pair of equations without additional restrictions on the constants. 
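
The verification just mentioned can also be carried out numerically. The following is only a sketch: it assumes the system of Example 1 is $(3D^2 + 3D + 2)x + (D^2 + 2D + 3)y = e^t$, $(2D^2 - D - 2)x + (D^2 + D + 1)y = 8$, as read off from the determinants above, and the function names `solution` and `residuals` are ours, introduced for illustration.

```python
import math

def solution(c1, c2, c3, c4, t):
    """Return (x, x', x'', y, y', y'') for the complete solution at time t."""
    e1, e2, et = math.exp(-t), math.exp(-2 * t), math.exp(t)
    co, si = math.cos(2 * t), math.sin(2 * t)
    # Relations among the constants found in the text.
    k1, k2 = -c1, -8 * c2 / 3
    k3, k4 = -2 * (c3 + c4), 2 * (c3 - c4)
    x = c1 * e1 + c2 * e2 + c3 * co + c4 * si + et / 10 - 3
    x1 = -c1 * e1 - 2 * c2 * e2 - 2 * c3 * si + 2 * c4 * co + et / 10
    x2 = c1 * e1 + 4 * c2 * e2 - 4 * c3 * co - 4 * c4 * si + et / 10
    y = k1 * e1 + k2 * e2 + k3 * co + k4 * si + et / 30 + 2
    y1 = -k1 * e1 - 2 * k2 * e2 - 2 * k3 * si + 2 * k4 * co + et / 30
    y2 = k1 * e1 + 4 * k2 * e2 - 4 * k3 * co - 4 * k4 * si + et / 30
    return x, x1, x2, y, y1, y2

def residuals(constants, t):
    """Left side minus right side of each of the two original equations."""
    x, x1, x2, y, y1, y2 = solution(*constants, t)
    r1 = (3 * x2 + 3 * x1 + 2 * x) + (y2 + 2 * y1 + 3 * y) - math.exp(t)
    r2 = (2 * x2 - x1 - 2 * x) + (y2 + y1 + y) - 8
    return r1, r2
```

For any choice of the four constants both residuals should vanish to within rounding error, which is what "without additional restrictions on the constants" means in practice.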


* The reason we encountered no such difficulty in our first illustrative exam- 
ple was that we were able to find x from an equation giving it explicitly in 
terms of y and its first derivative, and did not have to solve a second differ- 
ential equation, thereby introducing additional constants. 




EXAMPLE 2 

Solve the system of equations 

$Dx + (D - 1)y + (D + 2)z = 2e^t$

$(D - 1)x + Dy + (D - 2)z = ae^t$

$(D + 1)x + (D - 2)y + (D + 6)z = e^t$

From the preceding theory we expect that the differential equation satisfied by z is

$\begin{vmatrix} D & D - 1 & D + 2 \\ D - 1 & D & D - 2 \\ D + 1 & D - 2 & D + 6 \end{vmatrix} z = \begin{vmatrix} D & D - 1 & 2e^t \\ D - 1 & D & ae^t \\ D + 1 & D - 2 & e^t \end{vmatrix}$

However, expanding the determinants and operating, as required, on the known functions
$2e^t$, $ae^t$, and $e^t$, we obtain

$0 \cdot z = (a - 3)e^t$

Clearly, unless a = 3, this equation, and hence the system itself, has no solution. On the other
hand, if a = 3, this equation is satisfied by any function z. In fact, if a = 3, it is easy to verify
that the third equation in the given system is equal to twice the first equation minus the second.
Hence, when a = 3, the last equation is dependent upon the first two and is automatically
satisfied by any functions x(t), y(t), z(t) which satisfy them. Thus, considering only the first
two equations, we can write

$Dx + (D - 1)y = 2e^t - (D + 2)z$
$(D - 1)x + Dy = 3e^t - (D - 2)z$

and, for every differentiable function z, this system can be solved for x and y. Specifically,

$\begin{vmatrix} D & D - 1 \\ D - 1 & D \end{vmatrix} x = \begin{vmatrix} 2e^t - (D + 2)z & D - 1 \\ 3e^t - (D - 2)z & D \end{vmatrix}$

or

(11)  $(2D - 1)x = 2e^t - (5D - 2)z$

and

$\begin{vmatrix} D & D - 1 \\ D - 1 & D \end{vmatrix} y = \begin{vmatrix} D & 2e^t - (D + 2)z \\ D - 1 & 3e^t - (D - 2)z \end{vmatrix}$

or

(12)  $(2D - 1)y = 3e^t + (3D - 2)z$

From Eqs. (11) and (12), x and y may be found in terms of z. Moreover, since z is subject only
to the restriction that it be differentiable, it may contain any number of arbitrary constants,
and hence, when a = 3, but not otherwise, the solution of the original system may also contain
any number of arbitrary constants, as asserted by Theorem 1.

EXERCISES 

With the understanding that $D \equiv d/dt$, find a complete solution of each of the following systems
of equations:

1  $(D + 5)x + (D + 4)y = e^{-t}$
   $(D + 2)x + (D + 1)y = 3$

2  $(D + 5)x + (D + 3)y = e^{-t}$
   $(D + 2)x + (D + 1)y = 3$




3  $(D + 5)x + (D + 3)y = e^{-t}$
   $(2D + 1)x + (D + 1)y = 3$

4  $(2D + 5)x - (2D + 3)y = 1$
   $(D - 2)x + (D + 2)y = 0$

5  $(D + 2)x + (D - 1)y = 0$
   $(2D + 3)x + (3D + 1)y = 5 \sin 2t$

6  $(D + 1)x + (D + 2)y = e^t$
   $(2D + 7)x + (D + 10)y = e^{2t}$

7  $(2D + 3)x + (D + 4)y = 0$
   $(D + 1)x + (D + 2)y = 0$

8  $(2D^2 + 1)x + (D + 2)y = 5$
   $(D^2 - 16)x + (D - 4)y = 4$

9  $(9D^2 + 8)x + (3D^2 + 4)y = 0$
   $(2D^2 + 1)x + (D^2 + 2)y = 12 \cos 3t$

10  $(2D^2 - D - 1)x + (D - 1)y = 0$
    $(D^2 - 1)x + (D - 1)y = 0$

11  $(D + 1)x + y + 2z = 1$
    $x + (D + 2)y + z = e^{-t}$
    $5x + y + (D - 2)z = 5e^t$

12  $(D - 1)x - y = 1$
    $-2x + (D - 1)y - z = 0$
    $-2y + (D - 1)z = e^t$

13  $(D + 1)x + (D + 5)y + (2D + 5)z = 15e^t$
    $(2D + 1)x + (D + 2)y + (3D + 1)z = 10e^t$
    $(D + 3)x + (3D + 4)y + (4D + 6)z = 21e^t$

14  $(D + 1)x + (D + 3)y + (2D + 3)z = e^t$
    $(2D + 1)x + (D + 2)y + (3D + 1)z = 0$
    $(D + 3)x + (3D + 11)y + (4D + 13)z = 0$

15  $(D + 1)x + (D + 1)y + (2D + 3)z = 0$
    $(2D + 1)x + (D + 2)y + (3D + 5)z = 0$
    $(D + 3)x + (3D + 1)y + (4D + 5)z = 0$

16  If $(x_1,y_1)$ and $(x_2,y_2)$ are two solutions of the system

    $P_{11}(D)x + P_{12}(D)y = 0$
    $P_{21}(D)x + P_{22}(D)y = 0$

    prove that $(c_1x_1 + c_2x_2,\ c_1y_1 + c_2y_2)$ is also a solution of this system.

17  Find a system of differential equations having

    $x = Ae^{-t} + Be^t + Ce^{2t}$
    $y = Ae^{-t} - Be^t + 2Ce^{2t}$

    as a complete solution.

18  In Example 1, determine multiples of the two equations which, when added, will yield an
    equation expressing y directly in terms of x and its various derivatives. Do you think that
    this can be done in general?

19  Two tanks are connected as shown in Fig. 3.1. The first tank contains 100 gal of pure water;
    the second contains 100 gal of brine containing 2 lb of salt per gal. Liquid circulates through
    the tanks at a constant rate of 5 gal per min. If the brine in each tank is kept uniform by
    stirring, find the amounts of salt in the respective tanks as functions of time.




20 Three tanks are connected as shown in Fig. 3.2. The first tank contains 100 gal of pure 
water; the second contains 100 gal of brine containing 1 lb of salt per gal; the third contains 
100 gal of brine containing 2 lb of salt per gal. Liquid circulates through the tank at a 
constant rate of 5 gal per min. If the brine in each tank is kept uniform by stirring, find the 
amounts of salt in the respective tanks as functions of time. 



3.3 

Complementary functions and particular integrals
for systems of equations 

To illustrate the extension of the ideas of characteristic equation,
complementary function, and particular integral to systems of
differential equations, let us consider the following set of
equations:

     $(D + 1)x + (D + 2)y + (D + 3)z = -e^{-t} + 8t + 2$

(1)  $(D + 2)x + (D + 3)y + (2D + 3)z = e^{-t} + 11t - 1$

     $(4D + 6)x + (5D + 4)y + (20D - 12)z = 7e^{-t} + 2t$

As in the case of a single equation, we shall first make the
system homogeneous by neglecting the terms on the right, getting

     $(D + 1)x + (D + 2)y + (D + 3)z = 0$

(2)  $(D + 2)x + (D + 3)y + (2D + 3)z = 0$

     $(4D + 6)x + (5D + 4)y + (20D - 12)z = 0$

Guided by our experience in solving single equations, let us now
attempt to find solutions of this system of the form

(3)  $x = ae^{mt} \qquad y = be^{mt} \qquad z = ce^{mt}$

Substituting these into the equations in (2) and dividing out the
common factor $e^{mt}$ leads to the set of algebraic equations

     $(m + 1)a + (m + 2)b + (m + 3)c = 0$

(4)  $(m + 2)a + (m + 3)b + (2m + 3)c = 0$

     $(4m + 6)a + (5m + 4)b + (20m - 12)c = 0$





If nontrivial solutions for x, y, and z, i.e., solutions that do not
vanish identically, are to be obtained, it is necessary that a, b,
and c shall not all be zero. However, the values a = b = c = 0
obviously satisfy the system (4) and in general will be the only
solution of this set of equations. In fact, from college algebra (or
from Corollary 1, Theorem 7, Sec. 10.5) we know that no other
solutions are possible unless the determinant of the coefficients in
(4) is equal to zero. Thus we must have

$\begin{vmatrix} m + 1 & m + 2 & m + 3 \\ m + 2 & m + 3 & 2m + 3 \\ 4m + 6 & 5m + 4 & 20m - 12 \end{vmatrix} = -(m - 1)(m - 2)(m - 3) = 0$

This equation, which defines all the values of m for which nontrivial
solutions of (4), and hence of (2), can exist, is the characteristic
equation of the system. It is, of course, nothing but the
determinant of the operational coefficients of the system equated
to zero, with D replaced by m.
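
As a check on the expansion of this determinant, the following sketch evaluates it for integer values of m and compares it with the factored form. The name `char_det` is ours, introduced for illustration; only the coefficient matrix of (4) is taken from the text.

```python
def char_det(m):
    """Determinant of the coefficient matrix of system (4) for a given m."""
    (a, b, c), (d, e, f), (g, h, i) = (
        (m + 1, m + 2, m + 3),
        (m + 2, m + 3, 2 * m + 3),
        (4 * m + 6, 5 * m + 4, 20 * m - 12),
    )
    # Cofactor expansion along the first row.
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Integer values of m in a small window at which the determinant vanishes.
roots = [m for m in range(-10, 11) if char_det(m) == 0]
```

Since the determinant is a cubic in m, agreement with $-(m - 1)(m - 2)(m - 3)$ at more than four points is enough to confirm the factorization.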

From the roots of this equation, $m_1 = 1$, $m_2 = 2$, $m_3 = 3$, we
can construct three particular solutions

     $x_1 = a_1 e^t \qquad x_2 = a_2 e^{2t} \qquad x_3 = a_3 e^{3t}$

(5)  $y_1 = b_1 e^t \qquad y_2 = b_2 e^{2t} \qquad y_3 = b_3 e^{3t}$

     $z_1 = c_1 e^t \qquad z_2 = c_2 e^{2t} \qquad z_3 = c_3 e^{3t}$

provided that we establish the proper relations among the constants
in each of the three sets.

To do this, we note that the constants $a_i$, $b_i$, $c_i$ must satisfy
the equations of the system (4) for the corresponding value $m_i$.
Thus for $m_1 = 1$ we must have

$2a_1 + 3b_1 + 4c_1 = 0$
$3a_1 + 4b_1 + 5c_1 = 0$
$10a_1 + 9b_1 + 8c_1 = 0$

We know, of course, that the determinant of the coefficients of
this system is equal to zero. Hence these equations are consistent,
i.e., have a solution other than $a_1 = b_1 = c_1 = 0$, and it is easy to
verify that, for all values of $k_1$, they are satisfied by

$a_1 = -k_1 \qquad b_1 = 2k_1 \qquad c_1 = -k_1$

Therefore, for each value of $k_1$,

     $x_1 = -k_1 e^t$

(6)  $y_1 = 2k_1 e^t$

     $z_1 = -k_1 e^t$

is a particular solution of (2) corresponding to the characteristic
root $m_1 = 1$.



Similarly, for $m_2 = 2$, we have from (4)

$3a_2 + 4b_2 + 5c_2 = 0$
$4a_2 + 5b_2 + 7c_2 = 0$
$14a_2 + 14b_2 + 28c_2 = 0$

and it is easy to verify that, for all values of $k_2$, these are satisfied
by

$a_2 = 3k_2 \qquad b_2 = -k_2 \qquad c_2 = -k_2$

Therefore, a second family of particular solutions of (2) is

     $x_2 = 3k_2 e^{2t}$

(7)  $y_2 = -k_2 e^{2t}$

     $z_2 = -k_2 e^{2t}$

Finally, for $m_3 = 3$, we have from (4)

$4a_3 + 5b_3 + 6c_3 = 0$
$5a_3 + 6b_3 + 9c_3 = 0$
$18a_3 + 19b_3 + 48c_3 = 0$

and $a_3 = 9k_3 \qquad b_3 = -6k_3 \qquad c_3 = -k_3$

A third family of particular solutions of (2) is, therefore,

     $x_3 = 9k_3 e^{3t}$

(8)  $y_3 = -6k_3 e^{3t}$

     $z_3 = -k_3 e^{3t}$

Since the equations of the homogeneous system (2) are all
linear, sums of solutions will also be solutions. Hence we can combine
the three families of particular solutions (6), (7), and (8)
into a complete solution of (2):

     $x = x_1 + x_2 + x_3 = -k_1 e^t + 3k_2 e^{2t} + 9k_3 e^{3t}$

(9)  $y = y_1 + y_2 + y_3 = 2k_1 e^t - k_2 e^{2t} - 6k_3 e^{3t}$

     $z = z_1 + z_2 + z_3 = -k_1 e^t - k_2 e^{2t} - k_3 e^{3t}$

This is the complementary function of the original nonhomogeneous
system (1). We note that it contains precisely three arbitrary
constants, as required by Theorem 1, Sec. 3.2. The relations
among the nine constants originally present in the three particular
solutions (5) could also have been found by substituting
those solutions into any two of the equations of the homogeneous
system (2) and equating coefficients, as we did in Example 1,
Sec. 3.2.
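
The ratios a : b : c for each characteristic root can also be obtained mechanically. The sketch below is not the method used in the text: it takes the cross product of the first two rows of (4), which is orthogonal to both rows, and then confirms that the third row is satisfied as well. The helper names are ours, introduced for illustration.

```python
def coefficient_rows(m):
    """Rows of the algebraic system (4) for a given value of m."""
    return [
        (m + 1, m + 2, m + 3),
        (m + 2, m + 3, 2 * m + 3),
        (4 * m + 6, 5 * m + 4, 20 * m - 12),
    ]

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

def amplitude_ratios(m):
    """A vector (a, b, c) satisfying all three equations of (4) at a root m."""
    r1, r2, r3 = coefficient_rows(m)
    v = cross(r1, r2)
    # At a characteristic root the third equation must hold automatically.
    if sum(p * q for p, q in zip(r3, v)) != 0:
        raise ValueError("m is not a root of the characteristic equation")
    return v
```

For m = 1, 2, 3 this reproduces the multiples of $k_1$, $k_2$, $k_3$ appearing in (6), (7), and (8).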

To complete the problem we now need to find a particular
solution, or “integral,” of the nonhomogeneous system (1). To do
this, we assume for x, y, and z individual trial solutions exactly as
described in Table 2.2, Sec. 2.3. Thus, in the present case we
choose

$X = \alpha_1 e^{-t} + \alpha_2 t + \alpha_3 \qquad Y = \beta_1 e^{-t} + \beta_2 t + \beta_3 \qquad Z = \gamma_1 e^{-t} + \gamma_2 t + \gamma_3$

Substituting these into (1) and collecting terms, we find

$(\beta_1 + 2\gamma_1)e^{-t} + (\alpha_2 + 2\beta_2 + 3\gamma_2)t + (\alpha_2 + \beta_2 + \gamma_2 + \alpha_3 + 2\beta_3 + 3\gamma_3) = -e^{-t} + 8t + 2$

$(\alpha_1 + 2\beta_1 + \gamma_1)e^{-t} + (2\alpha_2 + 3\beta_2 + 3\gamma_2)t + (\alpha_2 + \beta_2 + 2\gamma_2 + 2\alpha_3 + 3\beta_3 + 3\gamma_3) = e^{-t} + 11t - 1$

$(2\alpha_1 - \beta_1 - 32\gamma_1)e^{-t} + (6\alpha_2 + 4\beta_2 - 12\gamma_2)t + (4\alpha_2 + 5\beta_2 + 20\gamma_2 + 6\alpha_3 + 4\beta_3 - 12\gamma_3) = 7e^{-t} + 2t$

Clearly, these three equations will hold identically if and only if
the following sets of conditions are satisfied:

      $\beta_1 + 2\gamma_1 = -1$

(10)  $\alpha_1 + 2\beta_1 + \gamma_1 = 1$

      $2\alpha_1 - \beta_1 - 32\gamma_1 = 7$

      $\alpha_2 + 2\beta_2 + 3\gamma_2 = 8$

(11)  $2\alpha_2 + 3\beta_2 + 3\gamma_2 = 11$

      $6\alpha_2 + 4\beta_2 - 12\gamma_2 = 2$

      $\alpha_2 + \beta_2 + \gamma_2 + \alpha_3 + 2\beta_3 + 3\gamma_3 = 2$

(12)  $\alpha_2 + \beta_2 + 2\gamma_2 + 2\alpha_3 + 3\beta_3 + 3\gamma_3 = -1$

      $4\alpha_2 + 5\beta_2 + 20\gamma_2 + 6\alpha_3 + 4\beta_3 - 12\gamma_3 = 0$

From the set (10) we find without difficulty that

$\alpha_1 = 3 \qquad \beta_1 = -1 \qquad \gamma_1 = 0$

From (11) we find that

$\alpha_2 = 1 \qquad \beta_2 = 2 \qquad \gamma_2 = 1$

Finally, from (12), after the values for $\alpha_2$, $\beta_2$, and $\gamma_2$ are inserted,
we find that

$\alpha_3 = -3 \qquad \beta_3 = -1 \qquad \gamma_3 = 1$

With these values for the constants, the particular integral of
the nonhomogeneous system (1) becomes

$X = 3e^{-t} + t - 3 \qquad Y = -e^{-t} + 2t - 1 \qquad Z = t + 1$

Hence, adding these to the respective components of the complementary
function (9), we have the complete solution of the
original system:

$x = -k_1 e^t + 3k_2 e^{2t} + 9k_3 e^{3t} + 3e^{-t} + t - 3$

$y = 2k_1 e^t - k_2 e^{2t} - 6k_3 e^{3t} - e^{-t} + 2t - 1$

$z = -k_1 e^t - k_2 e^{2t} - k_3 e^{3t} + t + 1$
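
The three sets of conditions (10), (11), and (12) are ordinary linear algebraic systems, so the undetermined coefficients can be checked by exact elimination. The following is a minimal sketch using rational arithmetic; the routine `solve` and the variable names are ours, introduced for illustration, and the elimination scheme is a generic Gauss-Jordan reduction rather than anything prescribed by the text.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination with row swaps."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Bring a row with a nonzero entry in this column into pivot position.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]

# Set (10): coefficients of exp(-t).
exp_part = solve([[0, 1, 2], [1, 2, 1], [2, -1, -32]], [-1, 1, 7])

# Set (11): coefficients of t.
lin_part = solve([[1, 2, 3], [2, 3, 3], [6, 4, -12]], [8, 11, 2])

# Set (12): constant terms, after moving the known alpha_2, beta_2, gamma_2
# contributions to the right-hand side.
a2, b2, g2 = lin_part
const_part = solve(
    [[1, 2, 3], [2, 3, 3], [6, 4, -12]],
    [2 - (a2 + b2 + g2), -1 - (a2 + b2 + 2 * g2), -(4 * a2 + 5 * b2 + 20 * g2)],
)
```

Note that (11) and (12) share the same coefficient matrix, which is just the matrix of (4) with m = 0; this is why the sets can be solved one after another.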





EXERCISES 

Find a complete solution of each of the following systems: 


1  $(D + 2)x + (D + 4)y = 1$
   $(D + 1)x + (D + 5)y = 2$

2  $(2D + 1)x + (D + 2)y = 0$
   $(D + 3)x + (D + 6)y = -3e^t$

3  $(D + 1)x + (4D - 2)y = t - 1$
   $(D + 2)x + (5D - 2)y = 2t - 1$

4  $(D + 5)x + (D + 7)y = 8e^{2t}$
   $(2D + 1)x + (3D + 1)y = 0$

5  $(2D + 1)x + (D + 2)y = e^t$
   $(D + 2)x + (D + 4)y = e^{-t}$

6  $(2D + 1)x + (D - 1)y = \cos t$
   $(D + 2)x + (D + 3)y = 0$

7  $(2D^2 + 5)x + (D^2 + 3)y = 1$
   $(D^2 + 7)x + (D^2 + 5)y = t$
   (Hint: Assume first $x = a \cos \lambda t$, $y = b \cos \lambda t$, and then
   $x = c \sin \lambda t$, $y = d \sin \lambda t$, where $\lambda$ is a parameter to be determined.)

8 (22)* + 7)7 + (2>* + 5)2/ = e"‘ 9 (22)* + 15)7 + (2)* + 12)2/ = cos t 

(32)* + 13)7 + (32)* + ll)y - tor* + 12 (32)* + 26)7 + (32)* + 28)y - 0 

10  $(2D + 11)x + (D + 3)y + (D - 2)z = 14e^t$
    $(D - 2)x + (D - 1)y + Dz = -2e^t$
    $(D + 1)x + (D - 3)y + (2D - 4)z = 4e^t$


CHAPTER FOUR 


Finite Differences 


4.1 

The differences of a function 

In the last three chapters we have developed methods for the 
solution of several large and important classes of differential 
equations. There are, of course, other families of equations for 
which exact solutions can be found, but in general, differential 
equations more complicated than the simple ones we have been 
considering must be solved by approximate, numerical methods. 
Among the most important of these are what are known as finite- 
difference methods. Since finite differences also occur in other 
branches of numerical analysis, such as interpolation, numerical 
differentiation and integration, curve fitting, and the smoothing 
of data, it is desirable that an applied mathematician have some 
familiarity with them, and the present chapter is devoted to this 
end. 

Suppose that we have a function y = f(x) given in tabular 
form for a sequence of values of x: 


x        f(x)

$x_0$    $f(x_0)$
$x_1$    $f(x_1)$
$x_2$    $f(x_2)$
$x_3$    $f(x_3)$


If $f(x_i)$ and $f(x_j)$ are any two values of $f(x)$, then the first divided
differences of $f(x)$ are defined by the formula*

(1)  $f(x_i,x_j) = \frac{f(x_i) - f(x_j)}{x_i - x_j}$

Similarly, if $f(x_i,x_j)$ and $f(x_j,x_k)$ are two first differences of $f(x)$
having one argument, $x_j$, in common, then the second divided differences
of $f(x)$ are defined by the formula

(2)  $f(x_i,x_j,x_k) = \frac{f(x_i,x_j) - f(x_j,x_k)}{x_i - x_k}$

* In most applications the subscripts of the arguments $x_i$ and $x_j$ will be
consecutive integers, but this is not a necessary restriction on the definition.

Proceeding inductively, a divided difference of any order is 
defined as the difference between two divided differences of the 
next lower order, overlapping in all but one of their arguments, 
divided by the difference between the extreme, or nonoverlapping, 
arguments appearing in these differences.* From these definitions 
it is clear that divided differences have the following properties : 


PROPERTY 1 

Any divided difference of the sum (or difference) of two functions is equal to the
sum (or difference) of the divided differences of the individual functions.


PROPERTY 2 

Any divided difference of a constant times a function is equal to the constant 
times the divided difference of the function. 


In many applications it is convenient to have the divided 
differences of a function prominently displayed. This is usually 
done by constructing a difference table in which each difference 
is entered, in the appropriate column, midway between the ele- 
ments in the preceding column from which it is constructed : 


x        f(x)

$x_0$    $f(x_0)$
                      $f(x_0,x_1)$
$x_1$    $f(x_1)$                     $f(x_0,x_1,x_2)$
                      $f(x_1,x_2)$                        $f(x_0,x_1,x_2,x_3)$
$x_2$    $f(x_2)$                     $f(x_1,x_2,x_3)$
                      $f(x_2,x_3)$
$x_3$    $f(x_3)$

or in a specific numerical example,

x      x^3

0      0
               1
1      1               4
               13              1
3      27              8               0
               37              1
4      64              14              0
               93              1
7      343             20
               193
9      729
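
The numerical table above can be regenerated by applying the definitions directly. The following sketch builds every column of a divided-difference table for arbitrarily spaced arguments; the function name `divided_difference_table` is ours, introduced for illustration, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction

def divided_difference_table(xs, fs):
    """Return the columns [f, first differences, second differences, ...]."""
    columns = [[Fraction(f) for f in fs]]
    order = 1
    while len(columns[-1]) > 1:
        prev = columns[-1]
        # Each entry is the difference of two overlapping lower-order entries
        # divided by the difference of their extreme (nonoverlapping) arguments.
        col = [
            (prev[i] - prev[i + 1]) / (xs[i] - xs[i + order])
            for i in range(len(prev) - 1)
        ]
        columns.append(col)
        order += 1
    return columns

xs = [0, 1, 3, 4, 7, 9]
table = divided_difference_table(xs, [x**3 for x in xs])
```

Here `table[1]` holds the first divided differences, `table[2]` the second, and so on; for the cubic $x^3$ the third differences are constant and the fourth vanish.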


* Though obvious only for divided differences of the first order, it is true (see
Exercises 13 and 14) that divided differences of all orders are symmetric functions
of their arguments. Thus, $f(x_i,x_j,x_k) = f(x_i,x_k,x_j) = f(x_k,x_i,x_j) = \cdots$.




Usually the values of x in a table of data will be equally
spaced, and the differences of the function will be based on sets of
consecutive functional values. When this is the case, the denominators
in the divided differences of any given order are all the
same, and it is customary to omit them. This leads to a modified
set of quantities known simply as the differences of the function.
If the constant difference between successive values of x is h, so
that the general value of x in the table is

$x_k = x_0 + kh \qquad k = \ldots, -2, -1, 0, 1, 2, \ldots$

and the corresponding functional value is

$y_k = f(x_k) = f(x_0 + kh) = f_k$

then the first differences of f are defined by the formula

$\Delta f_k = f_{k+1} - f_k$

Differences of higher order are defined in the same way, the
second differences being

$\Delta^2 f_k = \Delta(\Delta f_k) = \Delta f_{k+1} - \Delta f_k$

and, in general, for positive integral values of n,

$\Delta^n f_k = \Delta(\Delta^{n-1} f_k) = \Delta^{n-1} f_{k+1} - \Delta^{n-1} f_k$

These differences are also displayed in difference tables just like
divided differences.

Evidently the difference operator $\Delta$ has the characteristic
properties of a linear operator, for

$\Delta(f_k \pm g_k) = (f_{k+1} \pm g_{k+1}) - (f_k \pm g_k) = (f_{k+1} - f_k) \pm (g_{k+1} - g_k) = \Delta f_k \pm \Delta g_k$

and, if c is a constant,

$\Delta(cf_k) = cf_{k+1} - cf_k = c(f_{k+1} - f_k) = c\,\Delta f_k$

Moreover, $\Delta$ obeys the usual law of exponents

$\Delta^m(\Delta^n f_k) = \Delta^{m+n} f_k$

provided both m and n are positive integers. 

When the values of the independent variable are equally
spaced, the divided differences of a function can easily be expressed
in terms of ordinary differences and vice versa.
Specifically,

$f(x_0,x_1) = \frac{f(x_0) - f(x_1)}{x_0 - x_1} = \frac{f_0 - f_1}{-h} = \frac{\Delta f_0}{h}$

$f(x_0,x_1,x_2) = \frac{f(x_0,x_1) - f(x_1,x_2)}{x_0 - x_2} = -\frac{1}{2h}\left(\frac{\Delta f_0}{h} - \frac{\Delta f_1}{h}\right) = \frac{\Delta^2 f_0}{2!\,h^2}$

and, in general,

(6)  $f(x_0,x_1,\ldots,x_n) = \frac{\Delta^n f_0}{n!\,h^n}$

More generally, if the points used in constructing an nth divided
difference are the n + 1 equally spaced points between $x_0 - kh$
and $x_0 + (n - k)h$, inclusive, it is easy to show that

(7)  $f(x_{-k},x_{-k+1},\ldots,x_{n-k}) = \frac{\Delta^n f_{-k}}{n!\,h^n}$

The $\Delta$ symbolism for the differences of a function is known as
the advancing difference notation. In some applications, however,
another notation known as the central difference notation is more
convenient. In this, the symbol $\delta$ is used instead of $\Delta$, and the
subscript appearing in the symbol for any difference is the
average of the subscripts already assigned by this convention to
the elements which are subtracted in forming that difference.
Thus,

$\Delta f_k = f_{k+1} - f_k = \delta f_{k+1/2} \qquad \Delta f_{k+1} = f_{k+2} - f_{k+1} = \delta f_{k+3/2}$

$\Delta^2 f_k = \Delta f_{k+1} - \Delta f_k = \delta f_{k+3/2} - \delta f_{k+1/2} = \delta^2 f_{k+1}$


The following difference tables show the relation between the 
advancing and the central difference notations: 


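Advancing difference notation:

x        f

$x_0$    $f_0$
                  $\Delta f_0$
$x_1$    $f_1$                  $\Delta^2 f_0$
                  $\Delta f_1$                  $\Delta^3 f_0$
$x_2$    $f_2$                  $\Delta^2 f_1$                  $\Delta^4 f_0$
                  $\Delta f_2$                  $\Delta^3 f_1$
$x_3$    $f_3$                  $\Delta^2 f_2$
                  $\Delta f_3$
$x_4$    $f_4$

Central difference notation:

x        f

$x_0$    $f_0$
                  $\delta f_{1/2}$
$x_1$    $f_1$                  $\delta^2 f_1$
                  $\delta f_{3/2}$              $\delta^3 f_{3/2}$
$x_2$    $f_2$                  $\delta^2 f_2$                  $\delta^4 f_2$
                  $\delta f_{5/2}$              $\delta^3 f_{5/2}$
$x_3$    $f_3$                  $\delta^2 f_3$
                  $\delta f_{7/2}$
$x_4$    $f_4$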

In the first, elements with the same subscript lie on lines sloping 
downward, or advancing into the table. In the second, elements 
with the same subscript lie on lines extending horizontally, or 
centrally, into the table. 

Closely associated with $\Delta$ and $\delta$ is the operator E, which is
defined as the operator which increases the argument of a function
by one tabular interval. Thus,

$Ef(x_k) = f(x_k + h) = f(x_{k+1}) = f_{k+1}$

Applying E a second time again increases the argument of f by
h; that is,

$E^2 f(x_k) = E[Ef(x_k)] = Ef(x_k + h) = f(x_k + 2h) = f(x_{k+2}) = f_{k+2}$

and, in general, we define

(8)  $E^r f(x_k) = f(x_k + rh) = f(x_{k+r}) = f_{k+r}$

for any real number r. Clearly, E obeys the laws

$E(f_k \pm g_k) = Ef_k \pm Eg_k$

$E(cf_k) = cEf_k \qquad c$ a constant

$E^r(E^s f_k) = E^{r+s} f_k$





Two operators with the property that when they are applied
to the same function they yield the same result are said to be
operationally equivalent. Now, from the definition of $\Delta f_k$, we have

$\Delta f_k = f_{k+1} - f_k = Ef_k - f_k$

or, symbolically, $\Delta f_k = (E - 1)f_k$

Hence, we have the operational equivalences

(9)   $\Delta = E - 1$

(10)  $E = 1 + \Delta$

(11)  $E - \Delta = 1$

Moreover, by definition,

$\Delta f_k = \delta f_{k+1/2} = \delta E^{1/2} f_k$

Hence, we have the further equivalences

(12)  $\Delta = \delta E^{1/2}$

(13)  $\delta = \Delta E^{-1/2}$

Also, substituting from (12) into (9) and solving for $\delta$, we have

(14)  $\delta = E^{1/2} - E^{-1/2}$

By means of (9) we can express the various differences of a
function in terms of successive entries in the table of the function.
For we can write

$\Delta^n f_k = (E - 1)^n f_k$

and then, using the binomial expansion,

$\Delta^n f_k = \left[E^n - \binom{n}{1}E^{n-1} + \binom{n}{2}E^{n-2} + \cdots + (-1)^{n-1}\binom{n}{n-1}E + (-1)^n\binom{n}{n}\right] f_k$ †

$\quad = E^n f_k - nE^{n-1} f_k + \frac{n(n-1)}{2!}E^{n-2} f_k + \cdots + (-1)^{n-1} nEf_k + (-1)^n f_k$

(15)  $\quad = f_{k+n} - nf_{k+n-1} + \frac{n(n-1)}{2!} f_{k+n-2} + \cdots + (-1)^{n-1} nf_{k+1} + (-1)^n f_k$

Specifically, taking k = 0 and n = 1, 2, 3, 4, . . . , we have

       $\Delta f_0 = f_1 - f_0$

       $\Delta^2 f_0 = f_2 - 2f_1 + f_0$

(15a)  $\Delta^3 f_0 = f_3 - 3f_2 + 3f_1 - f_0$

       $\Delta^4 f_0 = f_4 - 4f_3 + 6f_2 - 4f_1 + f_0$

† The quantities $\binom{n}{j}$ are the so-called binomial coefficients, defined by the
formula $\binom{n}{j} = \frac{n!}{j!\,(n-j)!}$.
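
Formula (15) can be compared with the result of forming the differences one column at a time. The sketch below does both for a short table of equally spaced values; the function names are ours, introduced for illustration, and the sample data $f_k = k^3$ is an arbitrary choice.

```python
from math import comb

def forward_differences(values):
    """Return the columns [f, first differences, second differences, ...]."""
    columns = [list(values)]
    while len(columns[-1]) > 1:
        prev = columns[-1]
        columns.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return columns

def nth_difference_binomial(values, n, k=0):
    """Evaluate the nth difference of f at index k from formula (15)."""
    return sum((-1) ** j * comb(n, j) * values[k + n - j] for j in range(n + 1))

f = [k**3 for k in range(6)]      # sample table: f_k = k^3, k = 0, ..., 5
cols = forward_differences(f)
```

Every entry `cols[n][k]` of the difference table should agree with `nth_difference_binomial(f, n, k)`, since both are evaluations of $(E - 1)^n f_k$.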





If, further, we divide $q_1(x)$ by x - 2, we obtain a remainder $r_2$
and a quotient $q_2(x)$ such that

$q_1(x) = r_2 + (x - 2)q_2(x)$

and, substituting into (23),

$p(x) = r_0 + r_1 x + x(x - 1)[r_2 + (x - 2)q_2(x)]$

$\quad = r_0 + r_1(x)^{(1)} + r_2(x)^{(2)} + x(x - 1)(x - 2)q_2(x)$

Each application of this procedure leads to a new quotient whose
degree is one less than the degree of the preceding quotient.
Hence, the process must terminate after n + 1 steps with the
required expansion

(24)  $p(x) = r_0 + r_1(x)^{(1)} + r_2(x)^{(2)} + \cdots + r_{n-1}(x)^{(n-1)} + r_n(x)^{(n)}$

Obviously, the required divisions can easily be carried out by the
elementary process of synthetic division. Moreover, it is clear
from Eqs. (20), (21), and (24) that

$r_j = \frac{\Delta^j p(0)}{j!}$

or

(25)  $\Delta^j p(0) = j!\,r_j$

Hence, this method provides a convenient way of constructing
the difference table of a polynomial in the important case when
h = 1, since it furnishes us with the leading entry in each column
of the table and from these the table can be extended as far as
desired by simple addition, using the identity

$\Delta^{j-1} f_{k+1} = \Delta^{j-1} f_k + \Delta^j f_k$


EXAMPLE 1 

Express $p(x) = x^4 - 5x^3 + 3x + 4$ in terms of factorial polynomials and construct the difference
table of the function for h = 1.

Using synthetic division we have at once

1 |  1   -5    0    3  |  4
  |       1   -4   -4
2 |  1   -4   -4  | -1
  |       2   -4
3 |  1   -2  | -8
  |       3
     1  |  1

The remainders $r_0$, $r_1$, $r_2$, $r_3$ are the numbers set off at the right of the successive rows, and
$r_4$ is the final quotient; their values are 4, -1, -8, 1, 1. Hence

$p(x) = x^4 - 5x^3 + 3x + 4 = 4 - (x)^{(1)} - 8(x)^{(2)} + (x)^{(3)} + (x)^{(4)}$

as can be verified by direct expansion.

Now from (25) we have

$p(0) = 4 \qquad \Delta p(0) = -1 \qquad \Delta^2 p(0) = -16 \qquad \Delta^3 p(0) = 6 \qquad \Delta^4 p(0) = 24$




Hence we have the leading entries in the difference table for p(x), and by “crisscross” addition,
as indicated, the table can be extended and the values of p(x) determined as far as may be desired.

[Difference table for p(x); columns: x, p(x), $\Delta$, $\Delta^2$, $\Delta^3$, $\Delta^4$]
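
The two steps of Example 1 (repeated synthetic division, then crisscross addition) can be sketched as follows. The function names `factorial_coefficients` and `extend_table` are ours, introduced for illustration; the procedure assumes h = 1 and polynomial coefficients listed in descending powers.

```python
from math import factorial

def factorial_coefficients(coeffs):
    """Return [r0, r1, ..., rn] by synthetic division by x, x - 1, x - 2, ..."""
    remainders = []
    current = list(coeffs)
    root = 0
    while current:
        # Synthetic division of the current quotient by (x - root).
        acc = [current[0]]
        for c in current[1:]:
            acc.append(c + root * acc[-1])
        remainders.append(acc[-1])   # last entry is the remainder
        current = acc[:-1]           # the rest is the next quotient
        root += 1
    return remainders

def extend_table(leading, rows):
    """Recover p(0), p(1), ... from the leading differences of p."""
    diffs = list(leading)
    values = []
    for _ in range(rows):
        values.append(diffs[0])
        # Advance every column one step, using the identity that each entry
        # equals the entry above it plus the adjacent higher difference.
        for j in range(len(diffs) - 1):
            diffs[j] += diffs[j + 1]
    return values

r = factorial_coefficients([1, -5, 0, 3, 4])           # x^4 - 5x^3 + 3x + 4
leading = [factorial(j) * rj for j, rj in enumerate(r)]  # formula (25)
values = extend_table(leading, 8)
```

The list `values` should coincide with p(0), p(1), . . . , p(7) computed directly from the polynomial, which is a convenient check on both the division and the additions.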



Once a function has been expressed as a series of factorial 
polynomials, it is a simple matter to apply Eq. (18) or (19) to 
obtain its various differences. Conversely, when a function has 
been expressed as a series of factorial polynomials, it is easy to 
use these equations “in reverse” and find a new function having 
the given function as its first difference. By analogy with the 
terminology of calculus, we shall refer to such a function as an 
antidifference. 

EXAMPLE 2 

What is the general antidifference of the polynomial $p(x) = x^4 - 5x^3 + 3x + 4$?

From the results of Example 1 we know that

$p(x) = (x)^{(4)} + (x)^{(3)} - 8(x)^{(2)} - (x)^{(1)} + 4$

Hence, from Eq. (18), it is clear that the required antidifference, which is often denoted by the
symbol $\Delta^{-1}p(x)$, is

$\Delta^{-1}p(x) = \frac{(x)^{(5)}}{5} + \frac{(x)^{(4)}}{4} - \frac{8(x)^{(3)}}{3} - \frac{(x)^{(2)}}{2} + 4(x)^{(1)} + c$

where c is an arbitrary constant which can, and in general must, be added, since the difference of
any constant is obviously zero. The analogy between antidifferences and indefinite integrals
or antiderivatives is clear.

The determination of antidifferences is not just a mathematical
curiosity, but is intimately related to the important
problem of finding the sums of series. To see this, consider any
two consecutive columns in a difference table:

$\Delta^k f_1$
                    $\Delta^{k+1} f_1$
$\Delta^k f_2$
                    $\Delta^{k+1} f_2$
. . .
$\Delta^k f_n$
                    $\Delta^{k+1} f_n$
$\Delta^k f_{n+1}$

Now, from the definition of a difference we have

$\sum_{i=1}^{n} \Delta^{k+1} f_i = (\Delta^k f_2 - \Delta^k f_1) + (\Delta^k f_3 - \Delta^k f_2) + \cdots + (\Delta^k f_n - \Delta^k f_{n-1}) + (\Delta^k f_{n+1} - \Delta^k f_n)$

or, canceling the common terms in the series on the right,

(26)  $\sum_{i=1}^{n} \Delta^{k+1} f_i = \Delta^k f_{n+1} - \Delta^k f_1$

Since the kth difference of a function is obviously an antidifference
of the (k + 1)st difference, it is clear that Eq. (26) is equivalent
to the following theorem:


THEOREM 2

If F(i) is any antidifference of f(i), then the sum from i = 1 to i = n of the
series whose general term is f(i) is F(n + 1) - F(1).

The analogy between this theorem and the fundamental theorem 
of integral calculus is unmistakable. 

EXAMPLE 3 

What is the sum of the squares of the first n odd integers?

To facilitate the finding of the necessary antidifference, we first express the general term
of the series, namely, $(2i - 1)^2$, in terms of factorial polynomials:

$(2i - 1)^2 = 4i(i - 1) + 1 = 4(i)^{(2)} + 1$

Then, by the last theorem,

$\sum_{i=1}^{n} (2i - 1)^2 = \sum_{i=1}^{n} [4(i)^{(2)} + 1] = \left[\frac{4(i)^{(3)}}{3} + (i)^{(1)}\right]_{1}^{n+1}$

$\quad = \frac{4(n + 1)^{(3)}}{3} + (n + 1)^{(1)} - \frac{4(1)^{(3)}}{3} - (1)^{(1)}$

$\quad = \frac{4(n + 1)n(n - 1)}{3} + (n + 1) - 0 - 1$

$\quad = \frac{4n^3 - n}{3}$
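
The result of Example 3 is easily tested against a direct summation. The sketch below evaluates the antidifference used in the example and applies Theorem 2; the names `falling`, `F`, and `sum_odd_squares` are ours, introduced for illustration, and h = 1 is assumed throughout.

```python
from fractions import Fraction

def falling(x, n):
    """Factorial polynomial (x)^(n) = x(x - 1)...(x - n + 1) for h = 1."""
    result = 1
    for j in range(n):
        result *= x - j
    return result

def F(i):
    """The antidifference of (2i - 1)^2 = 4(i)^(2) + 1 used in Example 3."""
    return Fraction(4 * falling(i, 3), 3) + falling(i, 1)

def sum_odd_squares(n):
    """Sum of the squares of the first n odd integers, by Theorem 2."""
    return F(n + 1) - F(1)
```

For instance, n = 2 should give 1 + 9 = 10, in agreement with $(4n^3 - n)/3$.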




EXERCISES 

1 Prove Formulas (6) and (7).

2 Prove Formulas (18) and (19).

3 Express the following polynomials in terms of factorial polynomials, and construct a difference
table for each function:

a $x^3 - x + 1$      b $x^4 - 2x^3 - x$      c $x^5 - 2x^4 + 4x^3 - x + 6$

4 For each of the following difference tables, find the polynomial of minimum degree which
yields the given data:




—4 


24 


18 

0 

14 

42 

24 

56 




5 If h = 1, show that, for all values of the constants a and b, each of the following functions
satisfies the indicated relation:

a $y = a2^x + b3^x$          $(E^2 - 5E + 6)y = 0$
b $y = a2^x + bx2^x$         $(E^2 - 4E + 4)y = 0$
c $y = a3^x + b(-2)^x$       $(\Delta^2 + \Delta - 6)y = 0$

6 Find the sum of the cubes of the first n integers. 

7 Show that $\Delta(f_n g_n) = f_{n+1}\Delta g_n + g_n\Delta f_n = g_{n+1}\Delta f_n + f_n\Delta g_n$.

8 If h = 1, show that $\Delta \sin ax = 2 \sin \frac{a}{2} \cos a(x + \tfrac{1}{2})$ and that

$\Delta \cos ax = -2 \sin \frac{a}{2} \sin a(x + \tfrac{1}{2})$

9 Express each of the following in terms of factorial “polynomials” of the type (x)-™'. 
' 1 x * ~ 1 
(* + 2)(*+S) b 7“+T)(r+2j C (s + D(* + 3 ) 

10 What is Y — _? 

ih\ (* + 1 )(* + 2 )(* + 3 ) 

11 Show that $(x)^{(a)}(x)^{(b)} \neq (x)^{(a+b)}$, but that $(x + a)^{(a)}(x)^{(b)} = (x + a)^{(a+b)}$.

12 a Show that f y k = ^ and thenj by putting B - 1+ A, show that 

k = i h - 1 

| ». - r„ .+ »» - of* + • • ■]>'. 

& = l L 2! 3! J 

b Using the results of part a, evaluate J k 1 . 




13 Show that $f(x_0,x_1) = \frac{f(x_0)}{x_0 - x_1} + \frac{f(x_1)}{x_1 - x_0}$ and that

$f(x_0,x_1,x_2) = \frac{f(x_0)}{(x_0 - x_1)(x_0 - x_2)} + \frac{f(x_1)}{(x_1 - x_0)(x_1 - x_2)} + \frac{f(x_2)}{(x_2 - x_0)(x_2 - x_1)}$

What is the generalization of these results to divided differences of higher order?

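14 Show that

$f(x_0,x_1) = \dfrac{\begin{vmatrix} f(x_0) & f(x_1) \\ 1 & 1 \end{vmatrix}}{\begin{vmatrix} x_0 & x_1 \\ 1 & 1 \end{vmatrix}}$

$f(x_0,x_1,x_2) = \dfrac{\begin{vmatrix} f(x_0) & f(x_1) & f(x_2) \\ x_0 & x_1 & x_2 \\ 1 & 1 & 1 \end{vmatrix}}{\begin{vmatrix} x_0^2 & x_1^2 & x_2^2 \\ x_0 & x_1 & x_2 \\ 1 & 1 & 1 \end{vmatrix}}$

What is the generalization of these results to divided differences of higher order?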
16 If we define fix o,Xi, . . . ,x n -i,x„,x„) — lim fix o,xi, . . . ,x„_i,x»,x), show that 


fix OjXi, ... )X n -l,Xn,Xn) 


_ dfjx Q,Xi, . 


4.2

Interpolation formulas

One of the most important applications of finite differences is to 
the problem of interpolation. In courses such as algebra and 
trigonometry, where tables of the elementary functions must 
occasionally be used, it is customary to obtain values between 
adjacent entries by the method of proportional parts or linear 
interpolation. As is well known, this procedure amounts to 
replacing the arc of the tabulated function over one tabular 
interval by its chord and then reading the required functional 
value from the chord rather than from the arc itself (Fig. 4.1a). 







In this case the formula for the interpolated value turns out to be

(1)  $f(x_0 + rh) = f(x_0) + r[f(x_0 + h) - f(x_0)] = f_0 + r\,\Delta f_0$

Obviously, if h is relatively large or if the graph of f(x) is
changing direction rapidly, the chord may not be a good approximation
to the arc, and linear interpolation may involve a substantial
error. One way to overcome this difficulty would be to
approximate the graph of f(x) by some curve which would “fit”
the true arc more closely than a straight line could and then read
the interpolated value from this approximating curve rather than
from the chord (Fig. 4.1b). If, specifically, the graph of f(x) is
approximated over two successive tabular intervals by a parabola
of the form $y = a + bx + cx^2$ chosen to pass through the three
points

$[x_0, f(x_0)] \qquad [x_0 + h, f(x_0 + h)] \qquad [x_0 + 2h, f(x_0 + 2h)]$

the formula for the interpolated value is found without difficulty
to be

$f(x_0 + rh) = f(x_0) + r[f(x_0 + h) - f(x_0)] + \frac{r(r - 1)}{2}[f(x_0 + 2h) - 2f(x_0 + h) + f(x_0)]$

(2)  $\quad = f_0 + r\,\Delta f_0 + \frac{r(r - 1)}{2!}\,\Delta^2 f_0$
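
To see the effect of the second-difference term in (2), the two formulas can be compared on a function for which the parabola is exact. The following is a small sketch; the function name `interpolate` and the sample data $f(x) = x^2$ with $x_0 = 1$, h = 1 are our own choices for illustration.

```python
def interpolate(f0, f1, f2, r):
    """Return (linear, quadratic) estimates of f(x0 + r*h) from three entries."""
    d1 = f1 - f0                 # first difference of f at x0
    d2 = f2 - 2 * f1 + f0        # second difference of f at x0
    linear = f0 + r * d1                           # formula (1)
    quadratic = linear + r * (r - 1) / 2 * d2      # formula (2)
    return linear, quadratic

# Tabulated values of x^2 at x = 1, 2, 3; interpolate at x = 1.5 (r = 0.5).
linear, quadratic = interpolate(1, 4, 9, 0.5)
```

Here the linear estimate is 2.5, while the quadratic estimate reproduces the true value 2.25, as it must for any second-degree polynomial.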

Proceeding in this fashion, using polynomial curves of higher
and higher order to approximate the graph of f(x), one could
derive a succession of interpolation formulas involving higher and 
higher differences of the tabulated function and providing in 
general higher and higher accuracy in the interpolated values. 
In this section we shall obtain several important interpolation 
formulas, though we shall derive them by methods more general 
than the geometric approach we have just suggested. 

Probably the most fundamental interpolation formula is
Newton’s divided-difference formula:

(3)  $f(x) = f(x_0) + (x - x_0)f(x_0,x_1) + (x - x_0)(x - x_1)f(x_0,x_1,x_2) + \cdots$

$\qquad + (x - x_0)(x - x_1) \cdots (x - x_{n-1})f(x_0,x_1,\ldots,x_n)$

$\qquad + (x - x_0)(x - x_1) \cdots (x - x_n)f(x,x_0,x_1,\ldots,x_n)$

From this all the other interpolation formulas of interest to us
can easily be derived by suitably specializing the points $x_0$,
$x_1$, . . . , $x_n$, which need not be regularly spaced or taken in
consecutive order. For convenience in establishing (3) we shall
restrict our discussion to some special, though adequately typical,
value of n, say n = 2. Then, beginning with the third difference

$f(x,x_0,x_1,x_2) = \frac{f(x,x_0,x_1) - f(x_0,x_1,x_2)}{x - x_2}$

and solving for $f(x,x_0,x_1)$, we have

(4)  $f(x,x_0,x_1) = f(x_0,x_1,x_2) + (x - x_2)f(x,x_0,x_1,x_2)$

But $f(x,x_0,x_1) = \frac{f(x,x_0) - f(x_0,x_1)}{x - x_1}$

and, substituting this into (4) and solving for $f(x,x_0)$, we find

(5)  $f(x,x_0) = f(x_0,x_1) + (x - x_1)f(x_0,x_1,x_2) + (x - x_1)(x - x_2)f(x,x_0,x_1,x_2)$

Finally, since $f(x,x_0) = \frac{f(x) - f(x_0)}{x - x_0}$

we have, on substituting this into (5) and solving for f(x),

$f(x) = f(x_0) + (x - x_0)f(x_0,x_1) + (x - x_0)(x - x_1)f(x_0,x_1,x_2)$

$\qquad + (x - x_0)(x - x_1)(x - x_2)f(x,x_0,x_1,x_2)$

which is precisely Eq. (3) in the special case n = 2. The extension
of the preceding argument to any value of n is obvious.

The last term in (3) differs from the other terms in that the 
divided difference appearing in it contains x as one of its argu- 
ments and, hence, is not to be found among the entries in the 
difference table of $f(x)$. For this reason the last term is usually
referred to as the remainder after $n + 1$ terms or simply as the
error term, and the interpolation series is often written in the form

$$f(x) = p_n(x) + r_{n+1}(x) \tag{6}$$

where, of course, $p_n(x)$ is the nth-degree polynomial

$$\begin{aligned}
p_n(x) = f(x_0) &+ (x - x_0) f(x_0,x_1) + (x - x_0)(x - x_1) f(x_0,x_1,x_2) + \cdots \\
                &+ (x - x_0)(x - x_1) \cdots (x - x_{n-1}) f(x_0,x_1,\ldots,x_n)
\end{aligned} \tag{7}$$

and $r_{n+1}(x)$ is the function

$$r_{n+1}(x) = (x - x_0)(x - x_1) \cdots (x - x_n) f(x,x_0,x_1,\ldots,x_n) \tag{8}$$

Using (6), (7), and (8), it is possible to obtain an interesting 
alternative expression for an nth divided difference and ulti- 
mately a somewhat more tractable form of the remainder term in 
(3). To do this, we observe that $r_{n+1}(x)$ vanishes at least $n + 1$
times on the closed interval between the largest and smallest
values of the set $(x_0, x_1, \ldots, x_n)$, since, in fact, it vanishes when
$x = x_0, x_1, \ldots, x_n$. Therefore, assuming that the necessary
derivatives exist, it follows from Rolle’s theorem that $r'_{n+1}(x)$
must vanish at least $n$ times on this interval, $r''_{n+1}(x)$ must vanish
at least $n - 1$ times on this interval, and, continuing in this
fashion, $r^{(n)}_{n+1}(x)$ must vanish at least once on this interval. That is,
there must exist at least one value of $x$, say $x = \xi$, between the
largest and smallest values of the set $(x_0, x_1, \ldots, x_n)$, such that
$r^{(n)}_{n+1}(\xi) = 0$. Hence, differentiating (6) $n$ times and evaluating the
result for $x = \xi$, we have

$$f^{(n)}(\xi) - p_n^{(n)}(\xi) = r^{(n)}_{n+1}(\xi) = 0 \tag{9}$$

Now, from (7), the leading coefficient in the nth-degree polynomial
$p_n(x)$ is $f(x_0,x_1,\ldots,x_n)$. Therefore,

$$p_n^{(n)}(\xi) = n!\,f(x_0,x_1,\ldots,x_n)$$

and, hence, from (9) we have, for each value of $n$,

$$f(x_0,x_1,\ldots,x_n) = \frac{f^{(n)}(\xi)}{n!} \tag{10}$$

where $\xi$ is somewhere between the largest and smallest values of
the set $(x_0,x_1,\ldots,x_n)$.

Applying (10) to the $(n + 1)$st divided difference appearing
in the expression for $r_{n+1}(x)$ in (8), we have, as an alternative
form of $r_{n+1}(x)$,

$$r_{n+1}(x) = (x - x_0)(x - x_1) \cdots (x - x_n)\,\frac{f^{(n+1)}(\xi)}{(n+1)!} \tag{8a}$$

where now $\xi$ is somewhere between the largest and smallest
values of the set $(x,x_0,x_1,\ldots,x_n)$. The error term $r_{n+1}(x)$ is of
great importance in theoretical studies of the convergence of the
interpolation series (3), but the difficulty of estimating the factor

$$f(x,x_0,x_1,\ldots,x_n) = \frac{f^{(n+1)}(\xi)}{(n+1)!}$$

often limits its usefulness in numerical work. Of course, if f(x) is 
a polynomial of degree m, say, its divided differences of order 
greater than m are all exactly zero, and, if we extend the series (3) 
sufficiently far, the error term will be zero. In our work we shall 
neglect the error term in (3) on the assumption that, eventually, 
the divided differences become exactly zero or at least negligibly 
small, and that the series is extended to this point. 


EXAMPLE 1

Find $f(2)$ from the following data:

    x      -1.0     0.0     0.5     1.0     2.5     3.0
    f(x)   3.000  -2.000  -0.375   3.000  16.125  19.000

The construction of the difference table presents no problem, and, using Newton’s formula,
with $x_0 = 0$, we can write at once

$$\begin{aligned}
f(2) &= -2.000 + (2 - 0)(3.250) + (2 - 0)(2 - 0.5)(3.500) \\
     &\quad + (2 - 0)(2 - 0.5)(2 - 1)(-1.000) \\
     &= 12.000
\end{aligned}$$



In passing, we note that the ordinary process of linear interpolation between the tabular
points $x = 1.0$ and $x = 2.5$ yields the value $f(2) = 11.750$.
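The calculation of Example 1 can be sketched in a few lines of Python. The function names `divided_differences` and `newton_eval` are our own, introduced only for illustration; the first builds the leading divided differences $f(x_0), f(x_0,x_1), \ldots$ appearing in Eq. (3), and the second evaluates the polynomial part of (3) by nested multiplication, with the error term neglected.

```python
def divided_differences(xs, ys):
    """Return [f(x0), f(x0,x1), ..., f(x0,...,xn)] for the given points."""
    coef = list(ys)
    n = len(xs)
    for order in range(1, n):
        # Work from the bottom up so lower-order entries are still available.
        for i in range(n - 1, order - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - order])
    return coef


def newton_eval(xs, coef, x):
    """Evaluate the polynomial part of Eq. (3) by nested multiplication."""
    result = coef[-1]
    for k in range(len(coef) - 2, -1, -1):
        result = result * (x - xs[k]) + coef[k]
    return result


# Points used in Example 1, starting from x0 = 0.
xs = [0.0, 0.5, 1.0, 2.5]
ys = [-2.000, -0.375, 3.000, 16.125]
coef = divided_differences(xs, ys)
value = newton_eval(xs, coef, 2.0)
print(coef, value)
```

Since the points need not be regularly spaced or taken in consecutive order, the same two functions can be applied to any ordering of the tabular points.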


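Closely associated with Newton’s divided-difference formula
is Lagrange’s interpolation formula,*

$$\begin{aligned}
f(x) &= \frac{(x - x_1)(x - x_2) \cdots (x - x_n)}{(x_0 - x_1)(x_0 - x_2) \cdots (x_0 - x_n)}\,f(x_0) \\
     &\quad + \frac{(x - x_0)(x - x_2) \cdots (x - x_n)}{(x_1 - x_0)(x_1 - x_2) \cdots (x_1 - x_n)}\,f(x_1) + \cdots \\
     &\quad + \frac{(x - x_0)(x - x_1) \cdots (x - x_{n-1})}{(x_n - x_0)(x_n - x_1) \cdots (x_n - x_{n-1})}\,f(x_n)
\end{aligned} \tag{11}$$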

Like Newton's divided-difference formula, this formula provides 
the equation of a polynomial of degree n (or less) which takes on 
n + 1 prescribed functional values when x takes on the values 
$x_0, x_1, \ldots, x_n$. Equation (11) can easily be derived from Eq. (3),
but it is simpler merely to verify its properties. Clearly, it is a
polynomial of degree $n$ (or less), since each term on the right is a
polynomial of degree $n$. Moreover, when $x = x_0$, every fraction
except the first vanishes because of the factor $x - x_0$, and at the
same time the first fraction reduces to 1, leaving just $f(x) = f(x_0)$,
as required, when $x = x_0$. In the same way, when $x = x_1$, every
fraction except the second becomes zero, and we have $f(x) = f(x_1)$.
Similarly, we can verify without difficulty that $f(x)$ reduces to
$f(x_2), f(x_3), \ldots, f(x_n)$ when $x = x_2, x_3, \ldots, x_n$, as required.
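As a minimal sketch of Eq. (11), the following Python function (the name `lagrange_eval` is our own) forms each fraction as a running product and adds the weighted functional values. It is applied here to all six points of Example 1; because those data are fitted exactly by a cubic, we would expect it to reproduce the value found from Newton's formula.

```python
def lagrange_eval(xs, ys, x):
    """Evaluate Lagrange's interpolation formula, Eq. (11), at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                # One factor (x - x_j)/(x_i - x_j) of the i-th fraction.
                term *= (x - xj) / (xi - xj)
        total += term
    return total


xs = [-1.0, 0.0, 0.5, 1.0, 2.5, 3.0]
ys = [3.000, -2.000, -0.375, 3.000, 16.125, 19.000]
value_at_2 = lagrange_eval(xs, ys, 2.0)
print(value_at_2)
```

Evaluating the function at any tabular point returns the tabulated value itself, which is the property verified in the text.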

When the points $x_0, x_1, \ldots, x_n$ on which Newton's divided-
difference formula is based are regularly spaced with tabular
interval $h$, say, it is generally more convenient to express Formula
(3) in terms of ordinary differences. To do this we observe that if

$$x = x_0 + rh \qquad \text{and} \qquad x_k = x_0 + kh$$

then

$$x - x_k = h(r - k) \qquad k = 0, 1, 2, \ldots, n$$

and

$$(x - x_0)(x - x_1) \cdots (x - x_j) = h^{j+1}\,r(r - 1) \cdots (r - j) \tag{12}$$

Also, from Eq. (6), Sec. 4.1, we have

$$f(x_0,x_1,\ldots,x_{j+1}) = \frac{\Delta^{j+1} f_0}{(j+1)!\,h^{j+1}} \tag{13}$$

Hence, substituting from (12) and (13) into (3), we find

$$\begin{aligned}
f(x) &\equiv f(x_0 + rh) \\
     &= f_0 + r\,\Delta f_0 + \frac{r(r-1)}{2!}\,\Delta^2 f_0 + \frac{r(r-1)(r-2)}{3!}\,\Delta^3 f_0 + \cdots
\end{aligned} \tag{14}$$

which is known as the forward Gregory-Newton interpolation
formula.† Obviously this is a direct generalization of the formulas
of linear and parabolic interpolation [Eqs. (1) and (2)]. Of course,
the error term in (3) can be transformed into a corresponding
error term for the series (14), but we shall leave this as an
exercise.

* Named for the French mathematician Joseph Louis Lagrange (1736-1813).
† Co-named for the Scottish mathematician James Gregory (1661-1708).

For tables of limited extent, Formula (14) is especially
adapted to interpolation near the upper end, i.e., for smaller
values of $x$, and cannot conveniently be used near the lower
end. For the latter case it would be desirable to have a formula
using differences located above rather than below the point of
interpolation. Such a formula can easily be derived by choosing
the points $x_0, x_1, \ldots, x_n$ used in the divided-difference formula
(3) to be the points

$$x_0, \quad x_0 - h, \quad x_0 - 2h, \quad \ldots, \quad x_0 - nh$$

Then

$$x - x_k = h(r + k) \qquad k = 0, 1, 2, \ldots, n$$

and

$$(x - x_0)(x - x_1) \cdots (x - x_j) = h^{j+1}\,r(r + 1) \cdots (r + j) \tag{15}$$

Moreover, in this case the typical difference

$$f(x_0,x_1,\ldots,x_{j+1})$$

becomes

$$f(x_0,x_{-1},\ldots,x_{-j-1})$$

and, from the symmetry of divided differences, the last expression
is equal to

$$f(x_{-j-1},x_{-j},\ldots,x_0)$$

Hence, using Eq. (7), Sec. 4.1 (with $n = 0$, $k = j + 1$), we have,
for our current choice of points,

$$f(x_0,x_1,\ldots,x_{j+1}) = \frac{\Delta^{j+1} f_{-j-1}}{(j+1)!\,h^{j+1}} \tag{16}$$

Hence, substituting from (15) and (16) into (3), we find

$$\begin{aligned}
f(x) &\equiv f(x_0 + rh) \\
     &= f_0 + r\,\Delta f_{-1} + \frac{r(r+1)}{2!}\,\Delta^2 f_{-2} + \frac{r(r+1)(r+2)}{3!}\,\Delta^3 f_{-3} + \cdots
\end{aligned} \tag{17}$$

which is known as the backward Gregory-Newton interpolation
formula.


EXAMPLE 2

Compute $f(1.03)$ from the following data:

    x       f(x)        Δ          Δ²         Δ³
    1.00    1.000000
                        0.257625
    1.05    1.257625               0.015750
                        0.273375              0.000750
    1.10    1.531000               0.016500
                        0.289875              0.000750
    1.15    1.820875               0.017250
                        0.307125
    1.20    2.128000

The construction of the difference table presents no difficulty, and we need merely identify
$x_0 = 1.00$, $h = 0.05$, $r = 0.6$ and then substitute into Formula (14):

$$\begin{aligned}
f(1.03) &= f[1.00 + (0.6)(0.05)] \\
        &= 1.000000 + (0.6)(0.257625) + \frac{(0.6)(-0.4)}{2!}(0.015750) + \frac{(0.6)(-0.4)(-1.4)}{3!}(0.000750) \\
        &= 1.152727
\end{aligned}$$

Linear interpolation uses only the first two terms of the last series and hence yields the
(presumably) less accurate value $f(1.03) = 1.154575$.
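The substitution into Formula (14) can be sketched as follows. The helper names `forward_differences` and `gregory_newton_forward` are our own; the first returns the leading entries $f_0, \Delta f_0, \Delta^2 f_0, \ldots$ of the difference table, and the second accumulates the series (14), building each binomial-type coefficient from the preceding one.

```python
def forward_differences(ys):
    """Return [f0, Δf0, Δ²f0, ...], the leading diagonal of the difference table."""
    diffs, column = [ys[0]], list(ys)
    while len(column) > 1:
        column = [b - a for a, b in zip(column, column[1:])]
        diffs.append(column[0])
    return diffs


def gregory_newton_forward(x0, h, ys, x):
    """Sum the forward Gregory-Newton series (14) using every available difference."""
    r = (x - x0) / h
    total, coeff = 0.0, 1.0
    for k, d in enumerate(forward_differences(ys)):
        total += coeff * d
        coeff *= (r - k) / (k + 1)  # r(r-1)...(r-k)/(k+1)!
    return total


ys = [1.000000, 1.257625, 1.531000, 1.820875, 2.128000]
value = gregory_newton_forward(1.00, 0.05, ys, 1.03)
linear = ys[0] + 0.6 * (ys[1] - ys[0])  # first two terms only
print(value, linear)
```

Because the third differences of this table are constant, the fourth difference contributes nothing beyond rounding, and the series terminates of its own accord.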

There are various ways of obtaining central-difference 
interpolation formulas. For instance, we can choose the points 
used in Newton’s divided-difference formula in the following 
order: 

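$$x_0 = x_0 \qquad x_1 = x_0 + h \qquad x_2 = x_0 - h \qquad x_3 = x_0 + 2h \qquad x_4 = x_0 - 2h, \ldots$$

Then substituting into (3) and using Eq. (7), Sec. 4.1, to simplify
the various divided differences, we find

$$\begin{aligned}
f(x) &\equiv f(x_0 + rh) \\
     &= f_0 + r\,\Delta f_0 + \frac{r(r-1)}{2!}\,\Delta^2 f_{-1} + \frac{(r+1)r(r-1)}{3!}\,\Delta^3 f_{-1} \\
     &\quad + \frac{(r+1)r(r-1)(r-2)}{4!}\,\Delta^4 f_{-2} + \cdots
\end{aligned} \tag{18}$$

or, introducing the central-difference operator $\delta$ by means of the
operational equivalence $\Delta = \delta E^{1/2}$ [Eq. (12), Sec. 4.1],

$$\begin{aligned}
f(x_0 + rh) &= f_0 + r\,\delta f_{1/2} + \frac{r(r-1)}{2!}\,\delta^2 f_0 + \frac{(r+1)r(r-1)}{3!}\,\delta^3 f_{1/2} \\
            &\quad + \frac{(r+1)r(r-1)(r-2)}{4!}\,\delta^4 f_0 + \cdots
\end{aligned} \tag{18a}$$

This is known as the forward Newton-Gauss interpolation
formula.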


In exactly the same way, by choosing the points $x_0, x_1,
x_2, \ldots$ in the order

$$x_0 = x_0 \qquad x_1 = x_0 - h \qquad x_2 = x_0 + h \qquad x_3 = x_0 - 2h \qquad x_4 = x_0 + 2h, \ldots$$

and again substituting into (3) we obtain, after introducing the
central-difference notation,

$$\begin{aligned}
f(x_0 + rh) &= f_0 + r\,\delta f_{-1/2} + \frac{(r+1)r}{2!}\,\delta^2 f_0 + \frac{(r+1)r(r-1)}{3!}\,\delta^3 f_{-1/2} \\
            &\quad + \frac{(r+2)(r+1)r(r-1)}{4!}\,\delta^4 f_0 + \cdots
\end{aligned} \tag{19}$$

which is usually referred to as the backward Newton-Gauss
interpolation formula.

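If we take the average of Eqs. (18a) and (19) we obtain a
useful result known as Stirling’s interpolation formula:*

$$\begin{aligned}
f(x_0 + rh) &= f_0 + \frac{r}{1!}\,\frac{\delta f_{1/2} + \delta f_{-1/2}}{2} + \frac{r^2}{2!}\,\delta^2 f_0 \\
            &\quad + \frac{r(r^2 - 1)}{3!}\,\frac{\delta^3 f_{1/2} + \delta^3 f_{-1/2}}{2} + \frac{r^2(r^2 - 1)}{4!}\,\delta^4 f_0 + \cdots
\end{aligned} \tag{20}$$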

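Another formula of considerable utility can be obtained by
eliminating the differences of odd order from Eq. (18a) by means
of the formulas $\delta f_{1/2} = f_1 - f_0$, $\delta^3 f_{1/2} = \delta^2 f_1 - \delta^2 f_0$, .... This gives

$$\begin{aligned}
f(x_0 + rh) &= f_0 + r(f_1 - f_0) + \frac{r(r-1)}{2!}\,\delta^2 f_0 + \frac{(r+1)r(r-1)}{3!}\,(\delta^2 f_1 - \delta^2 f_0) \\
            &\quad + \frac{(r+1)r(r-1)(r-2)}{4!}\,\delta^4 f_0 \\
            &\quad + \frac{(r+2)(r+1)r(r-1)(r-2)}{5!}\,(\delta^4 f_1 - \delta^4 f_0) + \cdots
\end{aligned}$$

or, collecting terms,

$$\begin{aligned}
f(x_0 + rh) &= -(r-1) f_0 - \frac{r(r-1)(r-2)}{3!}\,\delta^2 f_0 - \frac{(r+1)r(r-1)(r-2)(r-3)}{5!}\,\delta^4 f_0 - \cdots \\
            &\quad + r f_1 + \frac{(r+1)r(r-1)}{3!}\,\delta^2 f_1 + \frac{(r+2)(r+1)r(r-1)(r-2)}{5!}\,\delta^4 f_1 + \cdots
\end{aligned}$$

Finally, if we set $1 - r = s$ in the coefficients of the differences of
$f_0$, we obtain the symmetric form

$$\begin{aligned}
f(x_0 + rh) &= s f_0 + \frac{s(s^2 - 1)}{3!}\,\delta^2 f_0 + \frac{s(s^2 - 1)(s^2 - 4)}{5!}\,\delta^4 f_0 + \cdots \\
            &\quad + r f_1 + \frac{r(r^2 - 1)}{3!}\,\delta^2 f_1 + \frac{r(r^2 - 1)(r^2 - 4)}{5!}\,\delta^4 f_1 + \cdots
\end{aligned} \tag{21}$$

which is known as the Laplace-Everett interpolation formula.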
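To illustrate how Formula (21) is used, here is a minimal sketch that keeps only the terms through the second central differences. The function name `everett` and its argument names are our own, and the truncation is an assumption made for brevity; with four consecutive tabular values it is exact for cubics. It is applied to the data of Example 2 to interpolate between $x = 1.05$ and $x = 1.10$.

```python
def everett(f_m1, f0, f1, f2, r):
    """Laplace-Everett formula (21), truncated after the second differences.

    f_m1, f0, f1, f2 are the tabular values at x0 - h, x0, x0 + h, x0 + 2h,
    and the result approximates f(x0 + r*h) for 0 <= r <= 1.
    """
    s = 1.0 - r
    d2_f0 = f1 - 2.0 * f0 + f_m1  # central second difference at x0
    d2_f1 = f2 - 2.0 * f1 + f0    # central second difference at x1
    return (s * f0 + s * (s * s - 1.0) / 6.0 * d2_f0
            + r * f1 + r * (r * r - 1.0) / 6.0 * d2_f1)


# Tabular values at x = 1.00, 1.05, 1.10, 1.15; here x0 = 1.05 and h = 0.05.
ys = [1.000000, 1.257625, 1.531000, 1.820875]
value = everett(*ys, r=0.6)  # approximates f(1.08)
print(value)
```

Only even-order differences on the two lines adjacent to the point of interpolation are required, which is the practical attraction of the symmetric form.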


* Named for the Scottish mathematician James Stirling (1692-1770).

EXERCISES

1 Establish Eq. (2) by finding the equation of the approximating parabola and evaluating
it at $x = x_0 + rh$. (Hint: Take $x_0, x_1, x_2$ to be $0, h, 2h$, respectively.)

2 Obtain the error terms in the forward and backward Gregory-Newton formulas from the
error term in Newton’s divided-difference formula.

3 Compute (a) $f(1.3)$ and (b) $f(1.95)$ from the following data:


    x      1.1    1.2    1.5    1.7    1.8    2.0
    f(x)   1.112  1.219  1.636  2.054  2.323  3.011


4 Compute (a) $\sqrt{50.2}$ and (b) $\sqrt{55.9}$ from the following data:

    x     50       51       52       53       54       55       56
    √x    7.07107  7.14143  7.21110  7.28011  7.34847  7.41620  7.48331


5 Fit a polynomial of minimum degree to the data of Example 1.
6 Fit a polynomial of minimum degree to the following data:


x 

-1 

1 

2 

4 

5 

fix) j 

13 


13 

33 

67 


7 If $y_0, y_1, y_2, y_3$ are the values of a function at the equally spaced values $x_0, x_1, x_2, x_3$, show
that the best estimate of the value of $y$ corresponding to the value of $x$ midway between $x_1$
and $x_2$ is

$$\frac{y_1 + y_2}{2} + \frac{(y_1 + y_2) - (y_0 + y_3)}{16}$$

8 Three readings are taken at equally spaced points x — 0, h, 2 h near the maximum (mini- 
mum) of a function y = f(x). Show that the abscissa of the maximum (minimum) is 
approximately 



and that the maximum (minimum) ordinate is approximately

$$y_1 - \frac{(\Delta y_0 + \Delta y_1)^2}{8\,\Delta^2 y_0}$$

9 Work Exercise 8, given that the three points where readings are taken are not equally
spaced.

10 Derive Lagrange’s interpolation formula by expanding

$$\frac{f(x)}{(x - x_0)(x - x_1) \cdots (x - x_n)}$$

into partial fractions.




4.3 

Numerical integration and differentiation 

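Any of the interpolation formulas we obtained in the last section
can be used to find the derivative of a tabular function. For
instance, if we consider the forward Gregory-Newton formula

$$\begin{aligned}
f(x_0 + rh) &= f_0 + r\,\Delta f_0 + \frac{r(r-1)}{2!}\,\Delta^2 f_0 + \frac{r(r-1)(r-2)}{3!}\,\Delta^3 f_0 \\
            &\quad + \frac{r(r-1)(r-2)(r-3)}{4!}\,\Delta^4 f_0 + \cdots
\end{aligned}$$

and differentiate with respect to $r$, we find

$$h f'(x_0 + rh) = \Delta f_0 + \frac{2r - 1}{2}\,\Delta^2 f_0 + \frac{3r^2 - 6r + 2}{6}\,\Delta^3 f_0 + \frac{2r^3 - 9r^2 + 11r - 3}{12}\,\Delta^4 f_0 + \cdots \tag{1}$$

$$h^2 f''(x_0 + rh) = \Delta^2 f_0 + (r - 1)\,\Delta^3 f_0 + \frac{6r^2 - 18r + 11}{12}\,\Delta^4 f_0 + \cdots \tag{2}$$

$$h^3 f'''(x_0 + rh) = \Delta^3 f_0 + \frac{2r - 3}{2}\,\Delta^4 f_0 + \cdots \tag{3}$$

$$h^4 f^{\mathrm{iv}}(x_0 + rh) = \Delta^4 f_0 + \cdots \tag{4}$$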


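Specifically, if we put $r = 0$, we find for the successive derivatives
at the tabular point $x_0$

$$f'(x_0) = \frac{1}{h}\left(\Delta f_0 - \frac{1}{2}\,\Delta^2 f_0 + \frac{1}{3}\,\Delta^3 f_0 - \frac{1}{4}\,\Delta^4 f_0 + \cdots\right) \tag{5}$$

$$f''(x_0) = \frac{1}{h^2}\left(\Delta^2 f_0 - \Delta^3 f_0 + \frac{11}{12}\,\Delta^4 f_0 - \cdots\right) \tag{6}$$

$$f'''(x_0) = \frac{1}{h^3}\left(\Delta^3 f_0 - \frac{3}{2}\,\Delta^4 f_0 + \cdots\right) \tag{7}$$

$$f^{\mathrm{iv}}(x_0) = \frac{1}{h^4}\left(\Delta^4 f_0 - \cdots\right) \tag{8}$$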


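Similarly, from the backward Gregory-Newton formula we
obtain

$$h f'(x_0 + rh) = \Delta f_{-1} + \frac{2r + 1}{2}\,\Delta^2 f_{-2} + \frac{3r^2 + 6r + 2}{6}\,\Delta^3 f_{-3} + \frac{2r^3 + 9r^2 + 11r + 3}{12}\,\Delta^4 f_{-4} + \cdots \tag{9}$$

$$h^2 f''(x_0 + rh) = \Delta^2 f_{-2} + (r + 1)\,\Delta^3 f_{-3} + \frac{6r^2 + 18r + 11}{12}\,\Delta^4 f_{-4} + \cdots \tag{10}$$

$$h^3 f'''(x_0 + rh) = \Delta^3 f_{-3} + \frac{2r + 3}{2}\,\Delta^4 f_{-4} + \cdots \tag{11}$$

$$h^4 f^{\mathrm{iv}}(x_0 + rh) = \Delta^4 f_{-4} + \cdots \tag{12}$$

and, at the point $x_0$,

$$f'(x_0) = \frac{1}{h}\left(\Delta f_{-1} + \frac{1}{2}\,\Delta^2 f_{-2} + \frac{1}{3}\,\Delta^3 f_{-3} + \frac{1}{4}\,\Delta^4 f_{-4} + \cdots\right) \tag{13}$$

$$f''(x_0) = \frac{1}{h^2}\left(\Delta^2 f_{-2} + \Delta^3 f_{-3} + \frac{11}{12}\,\Delta^4 f_{-4} + \cdots\right) \tag{14}$$

$$f'''(x_0) = \frac{1}{h^3}\left(\Delta^3 f_{-3} + \frac{3}{2}\,\Delta^4 f_{-4} + \cdots\right) \tag{15}$$

$$f^{\mathrm{iv}}(x_0) = \frac{1}{h^4}\left(\Delta^4 f_{-4} + \cdots\right) \tag{16}$$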

For a rigorous development, an error term analogous to
Eq. (8a), Sec. 4.2, should be found for any formula of numerical
differentiation. This can be done, but the results are of relatively
little use in routine calculations, and we shall not take them into
account. However, it should be borne in mind that, unless we are
dealing with a polynomial, numerical differentiation may involve
errors of considerable magnitude, the errors increasing
significantly as derivatives of higher order are computed.

EXAMPLE 1

Find the first and second derivatives of $\sqrt{x}$ at $x = 2.5$ from the table:

    x       √x         Δ          Δ²
    2.50    1.58114
                       0.01573
    2.55    1.59687               -0.00015
                       0.01558
    2.60    1.61245               -0.00015
                       0.01543
    2.65    1.62788               -0.00014
                       0.01529
    2.70    1.64317               -0.00015
                       0.01514
    2.75    1.65831

Using Eqs. (5) and (6) with $x_0 = 2.50$ and $h = 0.05$, we find at once

$$f'(2.5) = \frac{1}{0.05}\left[0.01573 - \frac{1}{2}(-0.00015)\right] = 0.3161$$

$$f''(2.5) = \frac{1}{(0.05)^2}\,(-0.00015) = -0.0600$$

The correct values to four decimal places are, of course,

$$f'(2.5) = \left.\frac{1}{2\sqrt{x}}\right|_{x=2.5} = 0.3162$$

$$f''(2.5) = \left.-\frac{1}{4x\sqrt{x}}\right|_{x=2.5} = -0.0632$$
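The arithmetic of this example is easily checked by machine. The sketch below uses only the first two terms of Eq. (5) and the first term of Eq. (6), exactly as in the example; the variable names are our own, and the exact derivatives of $\sqrt{x}$ are computed alongside for comparison.

```python
import math

h = 0.05
table = [1.58114, 1.59687, 1.61245]  # sqrt(x) at x = 2.50, 2.55, 2.60

d1 = table[1] - table[0]                   # Δf0
d2 = table[2] - 2.0 * table[1] + table[0]  # Δ²f0

first = (d1 - 0.5 * d2) / h  # Eq. (5), differences beyond the second neglected
second = d2 / h ** 2         # Eq. (6), differences beyond the second neglected

exact_first = 1.0 / (2.0 * math.sqrt(2.5))
exact_second = -1.0 / (4.0 * 2.5 * math.sqrt(2.5))
print(first, exact_first)
print(second, exact_second)
```

The second derivative is noticeably less accurate than the first, since the division by $h^2$ magnifies the rounding error already present in the five-place table.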

To obtain formulas for numerical integration, it is convenient
to begin by considering the related problem of the
summation of series, a topic on which we touched briefly at the
end of Sec. 4.1. In doing this it will be convenient to use certain
additional operational equivalences which we shall now develop.
We begin with Maclaurin’s expansion,

$$f(x + h) = f(x) + h f'(x) + \frac{h^2}{2!} f''(x) + \frac{h^3}{3!} f'''(x) + \cdots \tag{17}$$

or, introducing the operators $E$ and $D \equiv d/dx$,

$$E f(x) = \left(1 + hD + \frac{h^2 D^2}{2!} + \frac{h^3 D^3}{3!} + \cdots\right) f(x) \tag{18}$$

Now, the series on the right is simply the expansion of the
exponential $e^{hD}$. Hence, we can write (18) in the form

$$E f(x) = e^{hD} f(x)$$

from which we infer the operational equivalences

$$E = e^{hD} \tag{19}$$

$$\Delta = E - 1 = e^{hD} - 1 \tag{20}$$

Next, we introduce the integration operator

$$I f(x) = \int_x^{x+h} f(x)\,dx$$

Then

$$I D f(x) = \int_x^{x+h} f'(x)\,dx = f(x + h) - f(x) = \Delta f(x)$$

and, if $F(x)$ is any antiderivative of $f(x)$,

$$D I f(x) = D \int_x^{x+h} f(x)\,dx = D[F(x + h) - F(x)] = f(x + h) - f(x) = \Delta f(x)$$

Hence, $D$ and $I$ commute with each other, and we have the
further equivalences

$$ID = DI = \Delta \tag{21}$$

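We are now in a position to establish the famous Euler-
Maclaurin summation formula:

$$\sum_{i=0}^{n} f_i = \frac{1}{h} \int_{x_0}^{x_n} f(x)\,dx + \frac{1}{2}(f_0 + f_n) + \sum_{j=1}^{\infty} \frac{B_{2j}\,h^{2j-1}}{(2j)!}\left(f_n^{(2j-1)} - f_0^{(2j-1)}\right) \tag{22}$$

where the $B$’s are the Bernoulli numbers, $B_2 = \frac{1}{6}$, $B_4 = -\frac{1}{30}$,
..., to be defined below. We begin by writing, with the aid of
Eq. (21),

$$h\,\Delta f(x) = h D I f(x)$$

or, replacing $\Delta$ by its equivalent from Eq. (20),

$$h(e^{hD} - 1) f(x) = h D I f(x)$$

or further,

$$h f(x) = \frac{hD}{e^{hD} - 1}\,I f(x) \tag{23}$$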


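It is now necessary to expand the fractional operator
$hD/(e^{hD} - 1)$ in a power series in $hD$. This can be done in various
ways, but perhaps the simplest is to replace $e^{hD}$ by its series
equivalent and then make use of the method of undetermined
coefficients. Thus we have

$$\frac{hD}{\left(1 + hD + \dfrac{h^2 D^2}{2!} + \dfrac{h^3 D^3}{3!} + \cdots\right) - 1} = a_0 + a_1 hD + \frac{a_2}{2!}\,h^2 D^2 + \frac{a_3}{3!}\,h^3 D^3 + \cdots$$

or, simplifying the fraction on the left and then clearing
fractions,

$$1 = \left(1 + \frac{hD}{2!} + \frac{h^2 D^2}{3!} + \frac{h^3 D^3}{4!} + \cdots\right)\left(a_0 + a_1 hD + \frac{a_2}{2!}\,h^2 D^2 + \frac{a_3}{3!}\,h^3 D^3 + \cdots\right)$$

Now, multiplying the two series and equating the coefficients of
like powers of $hD$ on the two sides of this identity, we obtain the
equations

$$\begin{aligned}
a_0 &= 1 \\
\frac{a_0}{2!} + a_1 &= 0 \\
\frac{a_0}{3!} + \frac{a_1}{2!} + \frac{a_2}{2!} &= 0 \\
\frac{a_0}{4!} + \frac{a_1}{3!} + \frac{a_2}{2!\,2!} + \frac{a_3}{3!} &= 0 \\
\frac{a_0}{5!} + \frac{a_1}{4!} + \frac{a_2}{3!\,2!} + \frac{a_3}{2!\,3!} + \frac{a_4}{4!} &= 0
\end{aligned}$$

from which we find without difficulty

$$a_0 = 1 \qquad a_1 = -\tfrac{1}{2} \qquad a_2 = \tfrac{1}{6} \qquad a_3 = 0 \qquad a_4 = -\tfrac{1}{30}, \ldots$$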
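The method of undetermined coefficients just described lends itself to a short recursive calculation. In the sketch below (the function name `bernoulli_coefficients` is our own) the coefficient of $(hD)^m$ in the product of the two series is set equal to 1 for $m = 0$ and to 0 otherwise, and the resulting equation is solved for $a_m$; exact rational arithmetic is used so that the values can be compared directly with those quoted above.

```python
from fractions import Fraction
from math import factorial


def bernoulli_coefficients(count):
    """Solve sum_{k=0}^{m} a_k / (k! (m-k+1)!) = [m == 0] for a_0, ..., a_{count-1}."""
    a = []
    for m in range(count):
        # Contribution of the coefficients already determined.
        partial = sum(a[k] / (factorial(k) * factorial(m - k + 1)) for k in range(m))
        rhs = Fraction(1 if m == 0 else 0)
        # The unknown a_m enters with the factor 1/(m! 1!).
        a.append((rhs - partial) * factorial(m))
    return a


coeffs = bernoulli_coefficients(7)
print(coeffs)
```

Extending `count` produces as many further coefficients as may be required, each obtained from all of its predecessors.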

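The function $x/(e^x - 1)$ occurs in numerous applications,
and the coefficients $\{a_i\}$ in its expansion have many interesting
and important properties. These coefficients are ordinarily
referred to as the Bernoulli numbers $\{B_i\}$, and formulas have
been developed which give them explicitly for any value of $i$.
For our purposes we need to know only the numerical values of
the first few $B$’s and the fact that after $B_1$ all $B$’s with odd
subscripts are zero. Thus we can write

$$\frac{hD}{e^{hD} - 1} = \sum_{i=0}^{\infty} \frac{B_i}{i!}\,(hD)^i = 1 - \frac{hD}{2} + \frac{h^2 D^2}{12} - \frac{h^4 D^4}{720} + \cdots$$

and hence, returning to Eq. (23),

$$h f(x) = \left[\sum_{i=0}^{\infty} \frac{B_i}{i!}\,(hD)^i\right] I f(x)$$

or, detaching the first term from the series and factoring $hD$ from
the remaining terms,

$$\begin{aligned}
f(x) &= \frac{1}{h}\left[1 + hD \sum_{i=1}^{\infty} \frac{B_i}{i!}\,(hD)^{i-1}\right] I f(x) \\
     &= \frac{1}{h}\,I f(x) + \left[\sum_{i=1}^{\infty} \frac{B_i}{i!}\,(hD)^{i-1}\right] D I f(x) \\
     &= \frac{1}{h} \int_x^{x+h} f(x)\,dx + \left[\sum_{i=1}^{\infty} \frac{B_i}{i!}\,(hD)^{i-1}\right] \Delta f(x)
\end{aligned} \tag{24}$$

Now let us evaluate Eq. (24) for $x = x_0, x_1, \ldots, x_{n-1}$ and
add the results, recalling that
$\Delta f_0 + \Delta f_1 + \cdots + \Delta f_{n-1} = f_n - f_0$:

$$\begin{aligned}
f_0 &= \frac{1}{h} \int_{x_0}^{x_1} f(x)\,dx + \left[\sum_{i=1}^{\infty} \frac{B_i}{i!}\,(hD)^{i-1}\right] \Delta f_0 \\
    &\;\;\vdots \\
f_{n-1} &= \frac{1}{h} \int_{x_{n-1}}^{x_n} f(x)\,dx + \left[\sum_{i=1}^{\infty} \frac{B_i}{i!}\,(hD)^{i-1}\right] \Delta f_{n-1}
\end{aligned}$$

$$\sum_{i=0}^{n-1} f_i = \frac{1}{h} \int_{x_0}^{x_n} f(x)\,dx + \left[\sum_{i=1}^{\infty} \frac{B_i}{i!}\,(hD)^{i-1}\right] (f_n - f_0)$$

Since $B_1 = -\frac{1}{2}$ and $B_3 = B_5 = B_7 = \cdots = 0$, the last
formula can be simplified somewhat by detaching the first term
from the sum on the right-hand side and then setting $i = 2j$ in
the rest of the series:

$$\sum_{i=0}^{n-1} f_i = \frac{1}{h} \int_{x_0}^{x_n} f(x)\,dx - \frac{1}{2}(f_n - f_0) + \sum_{j=1}^{\infty} \frac{B_{2j}\,h^{2j-1}}{(2j)!}\left(f_n^{(2j-1)} - f_0^{(2j-1)}\right)$$

Finally, if we add $f_n$ to both members of this identity we obtain
Formula (22), as required.

If Eq. (22) is solved for the integral, we obtain

$$\int_{x_0}^{x_n} f(x)\,dx = h \sum_{i=0}^{n} f_i - \frac{h}{2}(f_0 + f_n) - \sum_{j=1}^{\infty} \frac{B_{2j}\,h^{2j}}{(2j)!}\left(f_n^{(2j-1)} - f_0^{(2j-1)}\right) \tag{25}$$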

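which is a fundamental formula of numerical integration.
Equation (25) is especially adapted to the integration of functions
defined by analytic expressions which can conveniently be
differentiated. For functions defined only by a table of values, it is
usually more convenient to have an integration formula in which
the “correction terms” are expressed as differences rather than as
derivatives. To obtain such a formula from Eq. (25), we need
only replace the derivatives $f_0', f_0''', \ldots$ by means of Eqs. (5),
(7), ... and the derivatives $f_n', f_n''', \ldots$ by means of Eqs.
(13), (15), .... This gives us

$$\begin{aligned}
\int_{x_0}^{x_n} f(x)\,dx &= h\left(\frac{f_0}{2} + f_1 + \cdots + f_{n-1} + \frac{f_n}{2}\right) \\
&\quad - \frac{h^2}{12}\left[\frac{1}{h}\left(\Delta f_{n-1} + \frac{1}{2}\Delta^2 f_{n-2} + \frac{1}{3}\Delta^3 f_{n-3} + \frac{1}{4}\Delta^4 f_{n-4} + \cdots\right) - \frac{1}{h}\left(\Delta f_0 - \frac{1}{2}\Delta^2 f_0 + \frac{1}{3}\Delta^3 f_0 - \frac{1}{4}\Delta^4 f_0 + \cdots\right)\right] \\
&\quad + \frac{h^4}{720}\left[\frac{1}{h^3}\left(\Delta^3 f_{n-3} + \frac{3}{2}\Delta^4 f_{n-4} + \cdots\right) - \frac{1}{h^3}\left(\Delta^3 f_0 - \frac{3}{2}\Delta^4 f_0 + \cdots\right)\right] - \cdots
\end{aligned}$$

$$\begin{aligned}
&= h\left(\frac{f_0}{2} + f_1 + \cdots + f_{n-1} + \frac{f_n}{2}\right) - \frac{h}{12}\left(\Delta f_{n-1} - \Delta f_0\right) \\
&\quad - \frac{h}{24}\left(\Delta^2 f_{n-2} + \Delta^2 f_0\right) - \frac{19h}{720}\left(\Delta^3 f_{n-3} - \Delta^3 f_0\right) \\
&\quad - \frac{3h}{160}\left(\Delta^4 f_{n-4} + \Delta^4 f_0\right) - \cdots
\end{aligned} \tag{26}$$

which is known as Gregory’s formula of numerical integration.
In passing, we note that both (25) and (26) reduce to the
well-known trapezoidal rule of integration if the correction terms are
neglected.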


EXAMPLE 2

Compute $\int_0^1 f(x)\,dx$ for the function defined by the following table:

    x     f(x)        Δ           Δ²           Δ³          Δ⁴
    0.0   0.4698220
                      0.0144778
    0.2   0.4842998               -0.0004670
                      0.0140108                0.0000290
    0.4   0.4983106               -0.0004380               -0.0000024
                      0.0135728                0.0000266
    0.6   0.5118834               -0.0004114               -0.0000023
                      0.0131614                0.0000243
    0.8   0.5250448               -0.0003871
                      0.0127743
    1.0   0.5378191

Using Eq. (26), we have at once

$$\begin{aligned}
\int_0^1 f(x)\,dx &= 0.2\left(\frac{0.4698220}{2} + 0.4842998 + 0.4983106 + 0.5118834 + 0.5250448 + \frac{0.5378191}{2}\right) \\
&\quad - \frac{0.2}{12}\,(0.0127743 - 0.0144778) - \frac{0.2}{24}\,(-0.0003871 - 0.0004670) \\
&\quad - \frac{19(0.2)}{720}\,(0.0000243 - 0.0000290) \\
&= 0.5047073
\end{aligned}$$

The integral in this problem is actually $\int_0^1 \log (2.95 + 0.5x)\,dx$, and its exact value is easily
found to be 0.5047074, correct to seven decimal places. The approximate value is, therefore, in
error by only 0.0000001.
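Gregory's formula (26) is readily programmed. The sketch below (the function name `gregory_integral` is our own) forms the trapezoidal sum and then subtracts the four correction terms displayed in (26), so it assumes a table of at least five entries. Unlike the hand calculation above, it retains the fourth-difference term; and in checking the result we take the logarithm in the integrand to be the common logarithm, which is an assumption on our part suggested by the first tabular entry.

```python
import math


def gregory_integral(ys, h):
    """Gregory's formula (26) with correction terms through the fourth differences."""
    n = len(ys) - 1
    # diffs[k][i] holds the k-th forward difference Δ^k f_i.
    diffs = [list(ys)]
    for _ in range(4):
        prev = diffs[-1]
        diffs.append([b - a for a, b in zip(prev, prev[1:])])
    trapezoid = h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
    weights = [1 / 12, 1 / 24, 19 / 720, 3 / 160]
    correction = 0.0
    for k, w in enumerate(weights, start=1):
        # Odd orders enter as a difference of end values, even orders as a sum.
        sign = -1.0 if k % 2 == 1 else 1.0
        correction += w * (diffs[k][n - k] + sign * diffs[k][0])
    return trapezoid - h * correction


ys = [0.4698220, 0.4842998, 0.4983106, 0.5118834, 0.5250448, 0.5378191]
approx = gregory_integral(ys, 0.2)


def antiderivative(x):
    """Antiderivative of log10(2.95 + 0.5 x) with respect to x."""
    u = 2.95 + 0.5 * x
    return 2.0 * (u * math.log(u) - u) / math.log(10.0)


exact = antiderivative(1.0) - antiderivative(0.0)
print(approx, exact)
```

Dropping all the correction terms leaves the trapezoidal rule, whose error for this table is several hundred times larger.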

In many important applications it is necessary to compute a
running integral of a tabular function, i.e., an integral of the form

$$\int_{x_0}^{x} f(x)\,dx$$

where $x$ takes on successively each of the values at which $f(x)$ is
tabulated. For such a calculation the familiar trapezoidal rule:

$$\int_{x_0}^{x_n} f(x)\,dx = h\left(\frac{f_0}{2} + f_1 + f_2 + \cdots + f_{n-2} + f_{n-1} + \frac{f_n}{2}\right) \tag{27}$$

is especially well adapted. For if to the given table we adjoin a
column of the averages

$$\frac{f_0 + f_1}{2}, \quad \frac{f_1 + f_2}{2}, \quad \ldots$$

the required integrals are precisely the sums of the entries in this
column from the top down to each entry in turn, multiplied by $h$.
Moreover, each sum can be found from the preceding one by
adding to it the next average. Table 4.1 shows the computational
pattern in detail.
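The computational pattern just described amounts to a running sum of the averages. A minimal sketch follows; the function name `running_trapezoid` is our own, and the short table used to exercise it is hypothetical, chosen linear so that the trapezoidal rule is exact.

```python
def running_trapezoid(ys, h):
    """Return the running integrals from x0 to x0, x1, ..., xn by the trapezoidal rule."""
    integrals = [0.0]
    for a, b in zip(ys, ys[1:]):
        # Each new integral is the preceding one plus h times the next average.
        integrals.append(integrals[-1] + h * 0.5 * (a + b))
    return integrals


# Hypothetical table of f(x) = x at x = 0, 1, 2, 3.
print(running_trapezoid([0.0, 1.0, 2.0, 3.0], 1.0))
```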

By recording each average in the cell above the one where it
appears in this table and then summing the column of averages
from the bottom upward, the process can also be adapted to the
calculation of running integrals of the form

$$\int_{x}^{x_n} f(x)\,dx$$

Table 4.1

    x      f(x)    Average          I = (1/h) ∫ f(x) dx, taken from x_0 to x
    x_0    f_0                      I_0 = 0
    x_1    f_1     (f_0 + f_1)/2    I_1 = I_0 + (f_0 + f_1)/2
    x_2    f_2     (f_1 + f_2)/2    I_2 = I_1 + (f_1 + f_2)/2
    x_3    f_3     (f_2 + f_3)/2    I_3 = I_2 + (f_2 + f_3)/2





15 Obtain the values of the c’s in the formula of Exercise 14
   a When n = 3.   b When n = 4.


4.4 

The numerical solution of differential equations

One of the most important applications of finite differences is to 
the numerical solution of differential equations which, because of 
their complexity, cannot be solved by exact methods. Many 
procedures are available for doing this, * some of considerable 
generality, others especially adapted to equations of a particular 
form. Of the many methods which have been devised we shall 
present only the method of Milne and the Runge-Kutta method. 
These can be applied to simultaneous differential equations as 
well as to single equations of any order and are, therefore, ade- 
quate for almost any problem one is likely to encounter. 

The fundamental problem is to find the solution of the first-
order differential equation

$$\frac{dy}{dx} = f(x,y) \tag{1}$$

which satisfies the initial condition $y = y_0$ when $x = x_0$. We do
not, of course, expect to find an equation for the solution. Instead,
our object is merely to plot or tabulate the solution curve point
by point, beginning at $(x_0,y_0)$ and continuing thereafter at
selected values of $x$, usually equally spaced, until the solution has
been extended over the required range.

* See, for instance, H. Levy and E. A. Baggott, “Numerical Studies in Differential
Equations,” vol. 1, C. A. Watts & Co., Ltd., London, 1934, and W. E.
Milne, “Numerical Solutions of Differential Equations,” John Wiley & Sons,
Inc., New York, 1953.

To develop Milne’s method we begin with Eq. (1), Sec. 4.3,
written in terms of $y$ rather than $f$, and evaluate it for $r = 1, 2, 3$,
and 4, getting

$$\begin{aligned}
y_1' &= \frac{1}{h}\left(\Delta y_0 + \frac{1}{2}\,\Delta^2 y_0 - \frac{1}{6}\,\Delta^3 y_0 + \frac{1}{12}\,\Delta^4 y_0 + \cdots\right) \\
y_2' &= \frac{1}{h}\left(\Delta y_0 + \frac{3}{2}\,\Delta^2 y_0 + \frac{1}{3}\,\Delta^3 y_0 - \frac{1}{12}\,\Delta^4 y_0 + \cdots\right) \\
y_3' &= \frac{1}{h}\left(\Delta y_0 + \frac{5}{2}\,\Delta^2 y_0 + \frac{11}{6}\,\Delta^3 y_0 + \frac{1}{4}\,\Delta^4 y_0 + \cdots\right) \\
y_4' &= \frac{1}{h}\left(\Delta y_0 + \frac{7}{2}\,\Delta^2 y_0 + \frac{13}{3}\,\Delta^3 y_0 + \frac{25}{12}\,\Delta^4 y_0 + \cdots\right)
\end{aligned}$$

or, neglecting differences beyond the fourth and replacing the
remaining differences by their equivalent expressions in terms of
the successive functional values [Eqs. (15a), Sec. 4.1],

$$\begin{aligned}
y_1' &= \frac{1}{12h}\,(-3y_0 - 10y_1 + 18y_2 - 6y_3 + y_4) \\
y_2' &= \frac{1}{12h}\,(y_0 - 8y_1 + 8y_3 - y_4) \\
y_3' &= \frac{1}{12h}\,(-y_0 + 6y_1 - 18y_2 + 10y_3 + 3y_4) \\
y_4' &= \frac{1}{12h}\,(3y_0 - 16y_1 + 36y_2 - 48y_3 + 25y_4)
\end{aligned} \tag{2}$$

Now, if we subtract the second equation in the last set from
twice the sum of the first and third equations and solve the result
for $y_4$, we obtain

$$y_4 = y_0 + \frac{4h}{3}\,(2y_1' - y_2' + 2y_3')$$

or, in more general terms,

$$y_{n+1} = y_{n-3} + \frac{4h}{3}\,(2y_{n-2}' - y_{n-1}' + 2y_n') \tag{3}$$

If we know the values of $y$ and $y'$ down to and including their
values at $x_n$, Eq. (3) thus enables us to “reach out” one step
further and compute $y_{n+1}$. With $y_{n+1}$ known, we can then return
to the given differential equation (1) and compute $y_{n+1}'$. Then
using Eq. (3) again, we can find $y_{n+2}$, and so on, step by step,
until the solution has been extended over the desired range. All
that remains is to devise a means of finding enough $y$’s and $y'$’s to
get the process under way.

One possibility is to begin the tabulation of $y$ by expanding
it in a Taylor series around the point $x = x_0$:

$$y = y_0 + y_0'(x - x_0) + y_0''\,\frac{(x - x_0)^2}{2!} + y_0'''\,\frac{(x - x_0)^3}{3!} + \cdots \tag{4}$$

The value of $y_0$ is, of course, given. The value of $y_0'$ can be found
at once by substituting $x_0$ and $y_0$ into the given differential
equation (1). To find the second derivative we need only differentiate
the given equation, getting

$$y'' = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\,y' \tag{5}$$

Since $f(x,y)$ is a given function, its partial derivatives are known
and become definite numbers when $x_0$ and $y_0$ are substituted into
them. Moreover, the value of $y'$ at $(x_0,y_0)$ has already been found,
and thus (5) furnishes the value of $y_0''$. Similarly, differentiating
(5) and evaluating the result at $(x_0,y_0)$ will give $y_0'''$, and so on.
In this way the first few terms of the expansion of $y$ around the
point $(x_0,y_0)$ can be constructed. In especially favorable cases the
general term of the series (4) can be found and the region of
convergence established. When this happens, (4) is the required
solution, and we need look no further. In general, however,
successive differentiation of $f(x,y)$ becomes too complicated to
continue or the resulting series converges too slowly to be of
practical value, and we must fall back on Milne’s or some similar
method.

With (4) available as a representation of $y$ in the neighborhood
of $x = x_0$, we can set $x = x_0 + h = x_1$ and calculate $y_1$.
Similarly, setting $x = x_0 + 2h$ and $x_0 + 3h$, we can find $y_2$ and
$y_3$. Then, substituting $(x_1,y_1)$, $(x_2,y_2)$, and $(x_3,y_3)$ into the given
differential equation, we can compute $y_1'$, $y_2'$, and $y_3'$ without
difficulty. With these values we are then in a position to begin the
step-by-step solution of the differential equation by means of
Eq. (3).

From the preceding discussion it is clear that Eq. (3) is in
general adequate for the step-by-step solution of $y' = f(x,y)$.
However, as a precaution against errors of various kinds, it is
desirable to have a second, independent formula into which
$y_{n+1}$ can be substituted as a check. To obtain such an equation we
return to (2) and add 4 times the third equation to the sum of the
second and fourth and solve the resulting equation for $y_4$, getting

$$y_4 = y_2 + \frac{h}{3}\,(y_2' + 4y_3' + y_4')$$

or, in more general terms,

$$y_{n+1} = y_{n-1} + \frac{h}{3}\,(y_{n-1}' + 4y_n' + y_{n+1}') \tag{6}$$

This formula cannot be used as a formula of extrapolation, since
it involves $y_{n+1}'$, which cannot be found unless $y_{n+1}$ is already
known. However, after $y_{n+1}$ has been found by means of (3),
$y_{n+1}'$ can be calculated, and enough information is then available
to permit the use of (6). If the value of $y_{n+1}$, as given by (6),
agrees with the value found from (3), we are ready to move on to
the calculation of $y_{n+2}$. On the other hand, if the two values of
$y_{n+1}$ do not agree, we must use the second value of $y_{n+1}$ to
compute a new value of $y_{n+1}'$, substitute these into (6), and continue
the process until two successive values of $y_{n+1}$ are in agreement.
When this happens, we are ready to continue the tabulation of $y$
by returning to (3) and determining an initial estimate of $y_{n+2}$.

Formulas like (3), which express a new value exclusively in
terms of quantities already found, are known as open formulas or
predictor formulas. Those, like (6), which express a new value in
terms of one or more additional new quantities and which,
therefore, can be used only for purposes of checking and refining are
known as closed formulas or corrector formulas.
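The interplay of the predictor (3) and the corrector (6) can be sketched as follows. The function name `milne`, the tolerance `tol`, and the iteration limit `max_iter` are our own choices, not part of the method as stated; and for the test equation $y' = y$ the three starting values beyond $y_0$ are simply taken from the known solution $e^x$, standing in for values that would ordinarily come from the Taylor series (4).

```python
import math


def milne(f, x0, h, start_ys, steps, tol=1e-12, max_iter=50):
    """Extend [y0, y1, y2, y3] by `steps` further points using Eqs. (3) and (6)."""
    xs = [x0 + i * h for i in range(4)]
    ys = list(start_ys)
    dys = [f(x, y) for x, y in zip(xs, ys)]
    for _ in range(steps):
        x_new = xs[-1] + h
        # Predictor, Eq. (3): uses y_{n-3} and y'_{n-2}, y'_{n-1}, y'_n.
        y_new = ys[-4] + 4.0 * h / 3.0 * (2.0 * dys[-3] - dys[-2] + 2.0 * dys[-1])
        for _iteration in range(max_iter):
            # Corrector, Eq. (6): repeated until two successive values agree.
            y_corr = ys[-2] + h / 3.0 * (dys[-2] + 4.0 * dys[-1] + f(x_new, y_new))
            agreed = abs(y_corr - y_new) <= tol
            y_new = y_corr
            if agreed:
                break
        xs.append(x_new)
        ys.append(y_new)
        dys.append(f(x_new, y_new))
    return xs, ys


start = [math.exp(0.1 * i) for i in range(4)]  # assumed starting values for y' = y
xs, ys = milne(lambda x, y: y, 0.0, 0.1, start, steps=7)
print(xs[-1], ys[-1], math.e)
```

Each pass through the corrector costs one further evaluation of $f(x,y)$, so that in practice $h$ is chosen small enough for one or two passes to suffice.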

The method of Milne is readily extended to the solution of
simultaneous and higher-order equations. For instance, if we have
the two equations

$$y' = f(x,y,z) \qquad \text{and} \qquad z' = g(x,y,z)$$

with the initial conditions $y = y_0$, $z = z_0$ when $x = x_0$, and if by
independent means we have calculated $(y_1,y_2,y_3)$, $(z_1,z_2,z_3)$, and
the related quantities $(y_1',y_2',y_3')$ and $(z_1',z_2',z_3')$, then, using Eq. (3)
and an identical version with $z$ replacing $y$, we can compute
$y_4$ and $z_4$. After that, we can compute $y_4'$ and $z_4'$ from the
differential equations and again use (3) to obtain $y_5$ and $z_5$, and so on as
far as desired. Of course, the closed formula (6) can be used to
check and correct both $y_{n+1}$ and $z_{n+1}$ if and when this is deemed
necessary.

The application of Milne’s method to equations of higher
order is now immediate, since such an equation can always be
replaced by a system of simultaneous first-order equations. For
instance $y'' = g(x,y,y')$ is equivalent to the system

$$y' = z \qquad z' = g(x,y,z)$$

which is just a special case, with $f(x,y,z) = z$, of the general
problem of two simultaneous first-order equations.

The Runge-Kutta method differs from Milne’s method in
several significant respects. In the first place, it requires no pre- 
determination of a set of starting values and, hence, is com- 
pletely self-contained. Second, it does not require the values of x 
at which the solution is being tabulated to be equally spaced; 
hence, the interval between successive values of x can be varied 
throughout the process, as time and accuracy may require. 

The Runge-Kutta method can be thought of as a generalization
of the following extremely simple (and quite inaccurate)
procedure: If one is given the first-order differential equation

$$\frac{dy}{dx} = f(x,y)$$

and the initial condition $y = y_0$ when $x = x_0$, the value of $y$, say
$y_1 = y_0 + \Delta y$, at $x_1 = x_0 + \Delta x$ can be approximated by using
the usual differential estimate of the increment $\Delta y$,

$$\Delta y = \left.\frac{dy}{dx}\right|_{x_0,y_0} \Delta x = f(x_0,y_0)\,\Delta x$$

With this value for $\Delta y$ available, an approximate value for
$y_1 = y_0 + \Delta y$, namely,

$$y_1 = y_0 + \left.\frac{dy}{dx}\right|_{x_0,y_0} \Delta x = y_0 + f(x_0,y_0)\,\Delta x \tag{7}$$

is determined, and the process can be repeated to obtain $y_2$,
$y_3$, ....

On the other hand, having a first approximation to $y_1$, one
can compute $dy/dx$ at the point $(x_1,y_1)$ and then use the average

$$\frac{1}{2}\left(\left.\frac{dy}{dx}\right|_{x_0,y_0} + \left.\frac{dy}{dx}\right|_{x_1,y_1}\right)$$

in Eq. (7), to obtain a (presumably) more accurate value for $y_1$,
namely,

$$y_1 = y_0 + \frac{1}{2}\left(\left.\frac{dy}{dx}\right|_{x_0,y_0} + \left.\frac{dy}{dx}\right|_{x_1,y_1}\right) \Delta x = y_0 + \frac{f(x_0,y_0) + f(x_1,y_1)}{2}\,\Delta x \tag{8}$$

before attempting to find $y_2$. Or one can compute the value of
$dy/dx$ at the point

$$\left(x_0 + \frac{\Delta x}{2},\; y_0 + \frac{\Delta y}{2}\right)$$

and use this instead of the derivative at $(x_0,y_0)$ in Eq. (7) to
obtain an improved value of $y_1$, namely,

$$y_1 = y_0 + \left.\frac{dy}{dx}\right|_{x_0 + \Delta x/2,\; y_0 + \Delta y/2} \Delta x = y_0 + f\!\left(x_0 + \frac{\Delta x}{2},\; y_0 + \frac{f(x_0,y_0)\,\Delta x}{2}\right) \Delta x \tag{9}$$

before continuing. The procedure based on Eq. (7) is known as
Euler’s method; that based on Eq. (8) is known as the modified
Euler method; and that based on Eq. (9) is known as Runge’s
method.
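One step of each of the three procedures based on Eqs. (7), (8), and (9) can be sketched as follows. The function names are our own, and the equation $y' = y$ with $y_0 = 1$ and $\Delta x = 0.1$ is used purely as a hypothetical test case.

```python
def euler_step(f, x0, y0, dx):
    """Euler's method, Eq. (7)."""
    return y0 + f(x0, y0) * dx


def modified_euler_step(f, x0, y0, dx):
    """Modified Euler method, Eq. (8): average the slopes at both ends."""
    y1 = euler_step(f, x0, y0, dx)  # first approximation to y1
    return y0 + 0.5 * (f(x0, y0) + f(x0 + dx, y1)) * dx


def runge_step(f, x0, y0, dx):
    """Runge's method, Eq. (9): use the slope at the midpoint."""
    dy = f(x0, y0) * dx
    return y0 + f(x0 + 0.5 * dx, y0 + 0.5 * dy) * dx


def rhs(x, y):
    """Right-hand side of the hypothetical test equation y' = y."""
    return y


for step in (euler_step, modified_euler_step, runge_step):
    print(step.__name__, step(rhs, 0.0, 1.0, 0.1))
```

For this particular equation the last two procedures happen to give the same value, each agreeing with the exact solution through the term in $(\Delta x)^2$.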

In the Runge-Kutta method, three or more estimates of $\Delta y$
are computed, and then the value of $\Delta y$ which is finally used to
determine $y_1$ is taken to be a linear combination of all of these.
Specifically, in Kutta’s third-order approximation we let

$$\begin{aligned}
\Delta_1 y \equiv k_1 &= f(x_0,y_0)\,\Delta x \equiv f(x_0,y_0)\,h \\
\Delta_2 y \equiv k_2 &= f(x_0 + p\,\Delta x,\; y_0 + p\,\Delta_1 y)\,\Delta x \equiv f(x_0 + ph,\; y_0 + pk_1)\,h \\
\Delta_3 y \equiv k_3 &= f(x_0 + q\,\Delta x,\; y_0 + r\,\Delta_2 y + (q - r)\,\Delta_1 y)\,\Delta x \\
                      &= f(x_0 + qh,\; y_0 + rk_2 + (q - r)k_1)\,h
\end{aligned}$$

and then put

$$\Delta y = ak_1 + bk_2 + ck_3$$

where $a, b, c, p, q, r$ are constants to be determined to ensure the
highest possible accuracy in $\Delta y$.

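To do this, we must first expand $k_1$, $k_2$, $k_3$, and then $\Delta y$ in
terms of powers of $\Delta x \equiv h$, using implicit differentiation to
compute the necessary derivatives. Using subscripts to indicate
the partial derivatives of $f$ evaluated at $h = 0$, i.e., letting

$$f_0 = f(x_0,y_0) \qquad f_1 = \left.\frac{\partial f(x,y)}{\partial x}\right|_{x_0,y_0} \qquad f_2 = \left.\frac{\partial f(x,y)}{\partial y}\right|_{x_0,y_0} \qquad f_{11} = \left.\frac{\partial^2 f(x,y)}{\partial x^2}\right|_{x_0,y_0} \qquad \cdots$$

and using $dk_i/dh$ and $d^2k_i/dh^2$ as abbreviations for $dk_i/dh|_{h=0}$
and $d^2k_i/dh^2|_{h=0}$, we thus have

$$k_1 = f_0 h$$

$$k_2 = \left\{ f_0 + \left(f_1 p + f_2 p \frac{dk_1}{dh}\right) h + \left[ f_{11}p^2 + 2f_{12}p^2 \frac{dk_1}{dh} + f_{22}p^2 \left(\frac{dk_1}{dh}\right)^2 \right] \frac{h^2}{2} + \cdots \right\} h$$

$$\quad = f_0 h + (f_1 + f_2 f_0)ph^2 + (f_{11} + 2f_{12}f_0 + f_{22}f_0^2)\frac{p^2h^3}{2} + \cdots$$

$$k_3 = \left( f_0 + \left\{ f_1 q + f_2 \left[ r\frac{dk_2}{dh} + (q - r)\frac{dk_1}{dh} \right] \right\} h + \left\{ f_{11}q^2 + 2f_{12}q\left[ r\frac{dk_2}{dh} + (q - r)\frac{dk_1}{dh} \right] + f_{22}\left[ r\frac{dk_2}{dh} + (q - r)\frac{dk_1}{dh} \right]^2 + f_2\left[ r\frac{d^2k_2}{dh^2} + (q - r)\frac{d^2k_1}{dh^2} \right] \right\} \frac{h^2}{2} + \cdots \right) h$$

$$\quad = f_0 h + (f_1 + f_2 f_0)qh^2 + \left[ (f_{11} + 2f_{12}f_0 + f_{22}f_0^2)q^2 + 2f_2(f_1 + f_2 f_0)pr \right]\frac{h^3}{2} + \cdots$$

Hence, substituting,

$$\Delta y = ak_1 + bk_2 + ck_3$$

$$\quad = af_0 h + b\left[ f_0 h + (f_1 + f_2 f_0)ph^2 + (f_{11} + 2f_{12}f_0 + f_{22}f_0^2)\frac{p^2h^3}{2} + \cdots \right] + c\left\{ f_0 h + (f_1 + f_2 f_0)qh^2 + \left[ (f_{11} + 2f_{12}f_0 + f_{22}f_0^2)q^2 + 2f_2(f_1 + f_2 f_0)pr \right]\frac{h^3}{2} + \cdots \right\}$$

(10)  $\quad = (a + b + c)f_0 h + (bp + cq)(f_1 + f_2 f_0)h^2 + \left[ \frac{bp^2 + cq^2}{2}(f_{11} + 2f_{12}f_0 + f_{22}f_0^2) + cpr\,f_2(f_1 + f_2 f_0) \right] h^3 + \cdots$

Now, since $y' = f(x,y)$, we have, by Maclaurin's expansion,

$$\Delta y = y_1 - y_0 = y_0' h + y_0''\frac{h^2}{2!} + y_0'''\frac{h^3}{3!} + \cdots$$

(11)  $\quad = f_0 h + (f_1 + f_2 f_0)\frac{h^2}{2} + \left[ (f_{11} + 2f_{12}f_0 + f_{22}f_0^2) + f_2(f_1 + f_2 f_0) \right]\frac{h^3}{6} + \cdots$

Hence, $\Delta y$ as given by (10) will agree with the Maclaurin expansion
of $\Delta y$, as given by (11), through terms in $h^3 = (\Delta x)^3$, provided

(12)  $a + b + c = 1 \qquad bp + cq = \frac{1}{2} \qquad bp^2 + cq^2 = \frac{1}{3} \qquad cpr = \frac{1}{6}$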

We thus have four equations in the six unknown parameters
$a$, $b$, $c$, $p$, $q$, $r$. The first three equations are linear in $a$, $b$, $c$ and can
easily be solved to express $a$, $b$, and $c$ in terms of $p$ and $q$. Then the
fourth equation can be used to express $r$ in terms of $p$ and $q$ also:

(13)  $a = \frac{6pq - 3(p + q) + 2}{6pq} \qquad b = \frac{2 - 3q}{6p(p - q)} \qquad c = \frac{2 - 3p}{6q(q - p)} \qquad r = \frac{q(q - p)}{p(2 - 3p)}$

Since $p$ and $q$ are arbitrary, we thus have a two-parameter family
of formulas which can be used for the step-by-step solution of the
equation $y' = f(x,y)$ with an error which is of the order of
$(\Delta x)^4 = h^4$.

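The following particular cases are worthy of note:

(I)  $a = \frac{1}{4} \quad b = 0 \quad c = \frac{3}{4} \quad p = \frac{1}{3} \quad q = r = \frac{2}{3}$

     $\Delta y = \frac{1}{4}(k_1 + 3k_3)$

     where  $k_1 = f(x_0,y_0)h$
            $k_2 = f(x_0 + \frac{1}{3}h,\; y_0 + \frac{1}{3}k_1)h$
            $k_3 = f(x_0 + \frac{2}{3}h,\; y_0 + \frac{2}{3}k_2)h$

(II) $a = \frac{1}{4} \quad b = c = \frac{3}{8} \quad p = q = r = \frac{2}{3}$

     $\Delta y = \frac{1}{8}(2k_1 + 3k_2 + 3k_3)$

     where  $k_1 = f(x_0,y_0)h$
            $k_2 = f(x_0 + \frac{2}{3}h,\; y_0 + \frac{2}{3}k_1)h$
            $k_3 = f(x_0 + \frac{2}{3}h,\; y_0 + \frac{2}{3}k_2)h$

The values of the parameters in case (II) cannot be obtained
from Eqs. (13), since $p = q$, but can be checked directly in Eqs.
(12).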
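
As a check on the algebra, the following sketch evaluates Eqs. (13) in exact rational arithmetic and tests the four conditions (12). The function names are hypothetical; the only inputs are the parameter values quoted for cases (I) and (II).

```python
from fractions import Fraction as F

def kutta3_coefficients(p, q):
    """Evaluate Eqs. (13); requires p != q, p != 0, q != 0, and p != 2/3."""
    p, q = F(p), F(q)
    a = (6 * p * q - 3 * (p + q) + 2) / (6 * p * q)
    b = (2 - 3 * q) / (6 * p * (p - q))
    c = (2 - 3 * p) / (6 * q * (q - p))
    r = q * (q - p) / (p * (2 - 3 * p))
    return a, b, c, r

def satisfies_eq12(a, b, c, p, q, r):
    """True when the four conditions of Eqs. (12) hold exactly."""
    return (a + b + c == 1
            and b * p + c * q == F(1, 2)
            and b * p**2 + c * q**2 == F(1, 3)
            and c * p * r == F(1, 6))

# Case (I): p = 1/3, q = 2/3 should give a = 1/4, b = 0, c = 3/4, r = 2/3.
case_I = kutta3_coefficients(F(1, 3), F(2, 3))
print(case_I)

# Case (II) has p = q, so Eqs. (13) do not apply; check Eqs. (12) directly.
print(satisfies_eq12(F(1, 4), F(3, 8), F(3, 8), F(2, 3), F(2, 3), F(2, 3)))
```

Exact fractions are used so that the equalities in (12) can be tested without rounding error.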

The foregoing analysis can be extended without difficulty to
yield step-by-step solution procedures in which the error is of the
order of $(\Delta x)^5 = h^5$. In particular, the following two sets of
formulas are quite useful:

(III) $\Delta y = \frac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4)$

      where  $k_1 = f(x_0,y_0)h$
             $k_2 = f(x_0 + \frac{1}{2}h,\; y_0 + \frac{1}{2}k_1)h$
             $k_3 = f(x_0 + \frac{1}{2}h,\; y_0 + \frac{1}{2}k_2)h$
             $k_4 = f(x_0 + h,\; y_0 + k_3)h$

(IV)  $\Delta y = \frac{1}{8}(k_1 + 3k_2 + 3k_3 + k_4)$

      where  $k_1 = f(x_0,y_0)h$
             $k_2 = f(x_0 + \frac{1}{3}h,\; y_0 + \frac{1}{3}k_1)h$
             $k_3 = f(x_0 + \frac{2}{3}h,\; y_0 + k_2 - \frac{1}{3}k_1)h$
             $k_4 = f(x_0 + h,\; y_0 + k_3 - k_2 + k_1)h$

The solution process based on (III) is often referred to specifically
as the Runge-Kutta method.
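
Formulas (III) translate directly into code. The sketch below is illustrative only: the names `rk4_step` and `tabulate` are hypothetical, and the test problem $y' = x^2 + y$, $y(0) = -1$, $h = 0.1$ is the one worked in Example 1 of this section, whose exact solution is $y = e^x - x^2 - 2x - 2$.

```python
import math

def rk4_step(f, x0, y0, h):
    """One step of formulas (III); returns y1 = y0 + (k1 + 2k2 + 2k3 + k4)/6."""
    k1 = f(x0, y0) * h
    k2 = f(x0 + h / 2, y0 + k1 / 2) * h
    k3 = f(x0 + h / 2, y0 + k2 / 2) * h
    k4 = f(x0 + h, y0 + k3) * h
    return y0 + (k1 + 2 * k2 + 2 * k3 + k4) / 6

def tabulate(f, x0, y0, h, steps):
    """Apply rk4_step repeatedly; returns the lists of x and y values."""
    xs, ys = [x0], [y0]
    for _ in range(steps):
        ys.append(rk4_step(f, xs[-1], ys[-1], h))
        xs.append(xs[-1] + h)
    return xs, ys

f = lambda x, y: x**2 + y
xs, ys = tabulate(f, 0.0, -1.0, 0.1, 4)
for x, y in zip(xs, ys):
    exact = math.exp(x) - x**2 - 2 * x - 2
    print(f"x = {x:.1f}   RK = {y:.6f}   exact = {exact:.6f}")
```

Because each step uses only the current point, `h` could be changed from one call of `rk4_step` to the next, which is the freedom of spacing noted at the start of the discussion.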





Any of the Runge-Kutta formulas can be used to solve
simultaneous and, hence, higher-order differential equations. For
instance, using (III), we can tabulate the solution of the system
of equations

$$\frac{dy}{dx} = f(x,y,z) \qquad \frac{dz}{dx} = g(x,y,z)$$

at intervals of $\Delta x \equiv h$ by computing

$$\Delta_1 y \equiv k_1 = f(x_0,y_0,z_0)h$$
$$\Delta_1 z \equiv l_1 = g(x_0,y_0,z_0)h$$
$$\Delta_2 y \equiv k_2 = f(x_0 + \tfrac{1}{2}h,\; y_0 + \tfrac{1}{2}k_1,\; z_0 + \tfrac{1}{2}l_1)h$$
$$\Delta_2 z \equiv l_2 = g(x_0 + \tfrac{1}{2}h,\; y_0 + \tfrac{1}{2}k_1,\; z_0 + \tfrac{1}{2}l_1)h$$
$$\Delta_3 y \equiv k_3 = f(x_0 + \tfrac{1}{2}h,\; y_0 + \tfrac{1}{2}k_2,\; z_0 + \tfrac{1}{2}l_2)h$$
$$\Delta_3 z \equiv l_3 = g(x_0 + \tfrac{1}{2}h,\; y_0 + \tfrac{1}{2}k_2,\; z_0 + \tfrac{1}{2}l_2)h$$
$$\Delta_4 y \equiv k_4 = f(x_0 + h,\; y_0 + k_3,\; z_0 + l_3)h$$
$$\Delta_4 z \equiv l_4 = g(x_0 + h,\; y_0 + k_3,\; z_0 + l_3)h$$

and then using the formulas

$$\Delta y = \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4)$$
$$\Delta z = \tfrac{1}{6}(l_1 + 2l_2 + 2l_3 + l_4)$$

If the various increments are computed in the indicated order,
each involves only quantities which have previously been
calculated.
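
The ordering of the increments can be seen in a short sketch. The function name is hypothetical, and the test system $y' = z$, $z' = -y$ with $y = 0$, $z = 1$ at $x = 0$ (that is, $y'' = -y$, with solution $y = \sin x$, $z = \cos x$) is an assumption chosen only because its exact solution is familiar.

```python
import math

def rk4_system_step(f, g, x0, y0, z0, h):
    """One step of formulas (III) for y' = f(x, y, z), z' = g(x, y, z)."""
    k1 = f(x0, y0, z0) * h
    l1 = g(x0, y0, z0) * h
    k2 = f(x0 + h / 2, y0 + k1 / 2, z0 + l1 / 2) * h
    l2 = g(x0 + h / 2, y0 + k1 / 2, z0 + l1 / 2) * h
    k3 = f(x0 + h / 2, y0 + k2 / 2, z0 + l2 / 2) * h
    l3 = g(x0 + h / 2, y0 + k2 / 2, z0 + l2 / 2) * h
    k4 = f(x0 + h, y0 + k3, z0 + l3) * h
    l4 = g(x0 + h, y0 + k3, z0 + l3) * h
    y1 = y0 + (k1 + 2 * k2 + 2 * k3 + k4) / 6
    z1 = z0 + (l1 + 2 * l2 + 2 * l3 + l4) / 6
    return y1, z1

# Assumed test system: y'' = -y written as y' = z, z' = -y.
f = lambda x, y, z: z
g = lambda x, y, z: -y

x, y, z, h = 0.0, 0.0, 1.0, 0.1
for _ in range(10):
    y, z = rk4_system_step(f, g, x, y, z, h)
    x += h
print(y, math.sin(x))
print(z, math.cos(x))
```

Each pair $k_i$, $l_i$ is computed before the pair $k_{i+1}$, $l_{i+1}$, so every argument is already available when it is needed.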

EXAMPLE 1

Tabulate the solution of $y' = x^2 + y$ at intervals of $h = 0.1$ if $y = -1$ when $x = 0$.

Using the Runge-Kutta formulas (III) for the first increment, we have

$$k_1 = -0.1000 \qquad k_2 = -0.1048 \qquad k_3 = -0.1050 \qquad k_4 = -0.1095$$

and $\Delta y = -0.1048$

Hence, $y_1 = y_0 + \Delta y = -1.1048$

For the second increment, we have, similarly,

$$k_1 = -0.1095 \qquad k_2 = -0.1137 \qquad k_3 = -0.1139 \qquad k_4 = -0.1179$$

and $\Delta y = -0.1138$

Hence, $y_2 = y_1 + \Delta y = -1.2186$

For the third increment, we have

$$k_1 = -0.1179 \qquad k_2 = -0.1215 \qquad k_3 = -0.1217 \qquad k_4 = -0.1250$$

and $\Delta y = -0.1216$

Hence, $y_3 = y_2 + \Delta y = -1.3402$

This process can, of course, be continued as far as desired. However, we shall calculate just
one more value of $y$, this time using Milne's method, which can now be applied since we have
values for $y_0$, $y_1$, $y_2$, and $y_3$. To do this we must first compute $y_0'$, $y_1'$, $y_2'$, $y_3'$ from the differential
equation, getting

$$y_0' = -1.000 \qquad y_1' = -1.0948 \qquad y_2' = -1.1786 \qquad y_3' = -1.2502$$

With these values, we can now use the open formula, Eq. (3), to obtain

$$y_4 = -1.4682$$

Using this value, we find from the differential equation that

$$y_4' = -1.3082$$

With this we can use Eq. (3) again to find $y_5$, or we can first use the closed formula, Eq. (6), to
check $y_4$ before continuing. In this case Eq. (6) also gives us $y_4 = -1.4682$, which is a good check
on the accuracy of our calculations.
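
The Milne step of this example can be retraced numerically. The sketch below assumes that Eq. (3) is Milne's open formula in the standard form $y_{n+1} = y_{n-3} + \frac{4h}{3}(2y_n' - y_{n-1}' + 2y_{n-2}')$ and that Eq. (6) is the closed formula $y_{n+1} = y_{n-1} + \frac{h}{3}(y_{n+1}' + 4y_n' + y_{n-1}')$; neither formula is restated in this part of the text, so both forms are assumptions. The starting values are the four-place Runge-Kutta results found above.

```python
h = 0.1
f = lambda x, y: x**2 + y

# y_0 .. y_3 as tabulated in the example (four-place values)
y = [-1.0, -1.1048, -1.2186, -1.3402]
dy = [f(i * h, y[i]) for i in range(4)]      # y'_0 .. y'_3

# Open (predictor) formula, assumed form of Eq. (3)
y4_open = y[0] + 4 * h / 3 * (2 * dy[3] - dy[2] + 2 * dy[1])

# Closed (corrector) formula, assumed form of Eq. (6)
dy4 = f(4 * h, y4_open)
y4_closed = y[2] + h / 3 * (dy4 + 4 * dy[3] + dy[2])

print(round(y4_open, 4), round(y4_closed, 4))
```

Agreement of the two values to four places is the check on the calculation that the example refers to.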

The differential equation $y' = x^2 + y$ is so simple that it can be solved exactly without
recourse to numerical methods, and by the methods of Chap. 1 (or Chap. 2) we find at once
that the required solution is

$$y = e^x - x^2 - 2x - 2$$

For $x = 0.1, 0.2, 0.3, 0.4$ this gives us the correct values

$$y_1 = -1.1048 \qquad y_2 = -1.2186 \qquad y_3 = -1.3401 \qquad y_4 = -1.4682$$

The values we computed for $y_1$, $y_2$, and $y_4$ agree with these to four decimal places, and the value
we computed for $y_3$ differs from the correct value by only 1 in the fourth place.


EXERCISES 

1 Using the Runge-Kutta method (III), find $y_6$ and $y_9$ in Example 1 without finding $y_5$, $y_7$,
or $y_8$. How do these values compare with the correct values?

2 Using Kutta's third-order approximation (I), tabulate the solution of the equation
$y' = x - y$ at intervals of $h = 0.1$ if $y = 1$ when $x = 1$.

3 Using Kutta's third-order approximation (II), tabulate the solution of the equation
$y' = x + y$ at intervals of $h = 0.1$ if $y = 1$ when $x = 0$.

4 Using Milne's method, tabulate the solution of the equation $y' = x + y$ at intervals of
$h = 0.1$ if $y = 1$ when $x = 0$.

5 Using the Runge-Kutta method (III), find $y_4$ and $z_4$ for the solution of the system

$$\frac{dy}{dx} = x + z \qquad \frac{dz}{dx} = x - y$$

given $h = 0.1$, and $y = 0$, $z = 1$ when $x = 0$.

6 Using Milne’s method, tabulate the solution of the system 


■ x s + yz 


at intervals of h = 0.1, given y 0 = 0, z 0 = 1. 

7 Work Exercise 2 using the open formula

$$y_{n+1} = y_n + h\left(y_n' + \tfrac{1}{2}\Delta y_{n-1}' + \tfrac{5}{12}\Delta^2 y_{n-2}' + \tfrac{3}{8}\Delta^3 y_{n-3}' + \tfrac{251}{720}\Delta^4 y_{n-4}'\right)$$

and the closed formula

$$y_{n+1} = y_n + h\left(y_{n+1}' - \tfrac{1}{2}\Delta y_n' - \tfrac{1}{12}\Delta^2 y_{n-1}' - \tfrac{1}{24}\Delta^3 y_{n-2}' - \tfrac{19}{720}\Delta^4 y_{n-3}'\right)$$

(These equations constitute the so-called Adams-Bashforth method for the numerical
solution of differential equations.)

8 Explain how the Adams-Bashforth method described in Exercise 7 can be extended to
systems of differential equations and equations of higher order.

9 Eliminate the differences from the formulas of Exercise 7, and express $y_{n+1}$ directly in
terms of $y_n$ and the various values of $y'$.




10 Work Exercise 3 using the open formula

$$y_{n+1} = y_n + h\left(\tfrac{23}{12}y_n' - \tfrac{16}{12}y_{n-1}' + \tfrac{5}{12}y_{n-2}'\right)$$

and the closed formula

$$y_{n+1} = y_n + h\left(\tfrac{5}{12}y_{n+1}' + \tfrac{8}{12}y_n' - \tfrac{1}{12}y_{n-1}'\right)$$

(These equations constitute the so-called Adams-Moulton method for the numerical
solution of differential equations.)

11 Using the open formula

$$y_{n+1} = 2y_n - y_{n-1} + h^2\left(y_n'' + \tfrac{1}{12}\Delta^2 y_{n-2}''\right)$$

and the closed formula

$$y_{n+1} = 2y_n - y_{n-1} + h^2\left(y_n'' + \tfrac{1}{12}\Delta^2 y_{n-1}''\right)$$

tabulate the solution of the equation $y'' = x + y$ at intervals of $\Delta x = h = 0.1$, given
$y_0 = 1$, $y_0' = 0$. How do your results compare with the exact solution?

12 Set up the Kutta third-order approximation corresponding to the values $p = \frac{1}{2}$, $q = 1$,
and show that it reduces to Simpson's rule when $f(x,y)$ is independent of $y$.

13 By expanding each term in Eq. (3) around the point $x = x_{n-3}$, show that the principal
part of the error in Milne's open formula is $\frac{14}{45}h^5 y^{(5)}_{n-3}$. What is the principal part of the
error in Milne's closed formula?

14 Find the equation of the polynomial of minimum degree for which $y$ and $y'$ take on
prescribed values $(y_0,y_0')$ and $(y_1,y_1')$ at $x = 0$ and $x = h$. What is the value of $y_2$ given by this
polynomial? How might this result be used to carry out the step-by-step integration of a
differential equation of the form $y' = f(x,y)$? How might an accompanying closed formula
be obtained?

15 Find the equation of the polynomial of minimum degree for which $y$ and $y''$ take on
prescribed values $(y_0,y_0'')$ and $(y_1,y_1'')$ at $x = 0$ and $x = h$. What is the value of $y_2$ given by this
polynomial? How might this result be used to carry out the step-by-step integration of a
differential equation of the form $y'' = f(x,y)$? How might an accompanying closed formula
be obtained?


4.5

Difference equations 

The many similarities we have already observed between the
calculus of finite differences and the ordinary, or infinitesimal,
calculus suggest that there should be a theory of difference equations
roughly paralleling the theory of differential equations; and
this is indeed the case. However, in the study of difference equations
we do not ordinarily consider equations of the form

(1)  $f(\Delta)y = \phi(x)$

as might be expected by analogy with the differential equation

(2)  $f(D)y = \phi(x)$

but rather equations of the form

(3)  $f(E)y = \phi(x)$



This, of course, is simply a matter of notational convenience,
since, using the operational equivalence $\Delta = E - 1$, any function
of $\Delta$ can be transformed at once into a function of $E$, and vice
versa. In this section we shall restrict ourselves to the case of a
single linear, constant-coefficient difference equation

(4)  $(a_0E^r + a_1E^{r-1} + \cdots + a_{r-1}E + a_r)y = \phi(x)$

where $\phi(x)$ is a linear combination of terms or products of terms
from the set

$$k^x \qquad \cos kx \qquad \sin kx \qquad k \text{ a constant}$$
$$\text{and} \qquad x^n \qquad n \text{ a nonnegative integer}$$

Since the substitution $t = hx$ will transform a function of $t$
tabulated at intervals of $h$ into a function of $x$ tabulated at unit
intervals, it is clearly no restriction to assume $h = 1$, so that
invariably $Ef(x) = f(x + 1)$, and we shall do this throughout the
present section. We shall base our solution of Eq. (4) primarily
on analogy with linear, constant-coefficient differential equations,
and such theoretical results as we may need we shall merely
quote without proof.

In Eq. (4), if both $a_0$ and $a_r$ are different from zero, as we
shall henceforth suppose, the positive integer $r$ is called the order
of the equation. If $\phi(x)$ is identically zero, Eq. (4) is said to be
homogeneous; if $\phi(x)$ is not identically zero, Eq. (4) is said to be
nonhomogeneous. By a solution of (4) we mean a function of $x$
with the property that, when it is substituted into (4), it reduces
the equation to an identity. From a theoretical point of view both
$x$ and $y$ should be regarded as continuous variables related by
Eq. (4) on a set of equally spaced values of $x$. However, in practical
problems we are almost always interested in $y$ only for the
discrete values $x = \ldots, -3, -2, -1, 0, 1, 2, 3, \ldots$, and in
our work we shall attempt no more than the determination of
solutions defined on this range.

For the second-order linear difference equation, with either
variable or constant coefficients, we have three theorems completely
analogous to the fundamental theorems of Sec. 2.1:

THEOREM 1

If $y_1(x)$ and $y_2(x)$ are any two solutions of the homogeneous equation

$$(a_0E^2 + a_1E + a_2)y = 0$$

then $c_1y_1(x) + c_2y_2(x)$, where $c_1$ and $c_2$ are arbitrary constants, is also a solution.

THEOREM 2

If $y_1(x)$ and $y_2(x)$ are two solutions of the homogeneous equation

$$(a_0E^2 + a_1E + a_2)y = 0$$

for which

$$C[y_1(x),y_2(x)]^{\dagger} = \begin{vmatrix} y_1(x) & y_2(x) \\ Ey_1(x) & Ey_2(x) \end{vmatrix} \ne 0$$

then any solution $y_3(x)$ of the homogeneous equation can be written in the form
$y_3(x) = c_1y_1(x) + c_2y_2(x)$, where $c_1$ and $c_2$ are suitable constants.


As a consequence of Theorem 2, the expression $c_1y_1(x) + c_2y_2(x)$
is called a complete solution of the homogeneous equation when
the particular solutions $y_1(x)$ and $y_2(x)$ satisfy the condition

$$C[y_1(x),y_2(x)] \ne 0$$


THEOREM 3

If $Y(x)$ is any solution of the nonhomogeneous equation

$$(a_0E^2 + a_1E + a_2)y = \phi(x)$$

and if $c_1y_1(x) + c_2y_2(x)$ is a complete solution of the homogeneous equation
obtained from this by deleting the term $\phi(x)$, then

$$y = c_1y_1(x) + c_2y_2(x) + Y(x)$$

is a complete solution of the nonhomogeneous equation.


As in the theory of differential equations, any complete solution
of the related homogeneous equation is usually called a
complementary function of the nonhomogeneous equation. The
extension of these theorems to difference equations of order
greater than 2 is obvious.

To find particular solutions of the homogeneous equation

(5)  $(a_0E^2 + a_1E + a_2)y = 0$

when the coefficients $a_0$, $a_1$, $a_2$ are constants, we might try, as
with the analogous differential equation,

(6)  $y = e^{mx}$

However, it is more convenient to assume

(7)  $y = M^x$

which is clearly equivalent to (6) with $M = e^m$. Substituting this
into (5), recalling our agreement that $Ef(x) = f(x + 1)$, we
obtain

$$a_0M^{x+2} + a_1M^{x+1} + a_2M^x = 0$$

or, dividing out $M^x$,

(8)  $a_0M^2 + a_1M + a_2 = 0$


† The function $C[y_1(x),y_2(x)]$ is customarily referred to as Casorati's
determinant, after the Italian mathematician Felice Casorati (1835-1890). Its
resemblance to the Wronskian $W[y_1(x),y_2(x)]$ (see Sec. 2.1) is apparent.




Naturally enough, this is called the characteristic equation of the
difference equation (5).

If the roots $M_1$ and $M_2$ of (8) are distinct, then

$$C(M_1^x,M_2^x) = \begin{vmatrix} M_1^x & M_2^x \\ M_1^{x+1} & M_2^{x+1} \end{vmatrix} = M_1^xM_2^{x+1} - M_2^xM_1^{x+1} = M_1^xM_2^x(M_2 - M_1) \ne 0^{\dagger}$$

and, hence, by Theorem 2, a complete solution of Eq. (5) is

(9)  $y = c_1M_1^x + c_2M_2^x$

If $M_1$ and $M_2$ are real, this is a completely acceptable form of the
solution. However, if $M_1$ and $M_2$ are complex, then (9) is inconvenient
for most purposes, and it is desirable that we reduce it to
a more useful form. To do this, let the roots be

$$M_1, M_2 = p \pm iq = re^{\pm i\theta}\,{}^{\ddagger}$$

where $r = \sqrt{p^2 + q^2}$ and $\tan\theta = q/p$.

Then we can write

$$y = c_1(re^{i\theta})^x + c_2(re^{-i\theta})^x$$
$$\quad = r^x(c_1e^{i\theta x} + c_2e^{-i\theta x})$$
$$\quad = r^x[c_1(\cos\theta x + i\sin\theta x) + c_2(\cos\theta x - i\sin\theta x)]$$
$$\quad = r^x[(c_1 + c_2)\cos\theta x + i(c_1 - c_2)\sin\theta x]$$

or, renaming the constants,

(10)  $y = r^x(A\cos\theta x + B\sin\theta x)$

If $M_1 = M_2$, clearly $C(M_1^x,M_2^x) = 0$, and we must find a
second, independent solution before we can construct the complete
solution of (5). Again by analogy with differential equations,
we are led to try

$$y = xM_1^x$$

and we find by direct substitution that this is indeed a solution
when the characteristic equation (8) has equal roots. For we have

$$a_0(x + 2)M_1^{x+2} + a_1(x + 1)M_1^{x+1} + a_2xM_1^x = xM_1^x(a_0M_1^2 + a_1M_1 + a_2) + M_1^{x+1}(2a_0M_1 + a_1) = 0$$

since the coefficient of $xM_1^x$ vanishes, because in any case $M_1$
satisfies the characteristic equation (8); and the coefficient of
$M_1^{x+1}$ vanishes, because when the characteristic equation has
equal roots their common value is $M_1 = -a_1/2a_0$. Moreover, for
the solutions $M_1^x$ and $xM_1^x$ we have

$$C(M_1^x,xM_1^x) = M_1^x(x + 1)M_1^{x+1} - xM_1^xM_1^{x+1} = M_1^{2x+1} \ne 0$$

Hence, according to Theorem 2, a complete solution when the
characteristic equation has equal roots is

(11)  $y = c_1M_1^x + c_2xM_1^x$


† Since $a_2 \ne 0$, or else the difference equation would be of order less than 2,
contrary to hypothesis, it is clear that neither $M_1$ nor $M_2$ can be zero.
‡ For a discussion of the exponential form of a complex number, see Sec. 14.7.


The results of the preceding discussion are summarized in Table 
4.2. 


TABLE 4.2

Difference equation:      $(a_0E^2 + a_1E + a_2)y = 0$,   $a_0, a_2 \ne 0$
Characteristic equation:  $a_0M^2 + a_1M + a_2 = 0$

Nature of the roots of the      Condition on the coefficients     Complete solution of the
characteristic equation         of the characteristic equation    difference equation
-----------------------------   -------------------------------   -----------------------------------------
Real and unequal                $a_1^2 - 4a_0a_2 > 0$             $y = c_1M_1^x + c_2M_2^x$
  $M_1 \ne M_2$

Real and equal                  $a_1^2 - 4a_0a_2 = 0$             $y = c_1M_1^x + c_2xM_1^x$
  $M_1 = M_2$

Conjugate complex               $a_1^2 - 4a_0a_2 < 0$             $y = r^x(A\cos\theta x + B\sin\theta x)$
  $M_1 = p + iq$                                                    $r = \sqrt{p^2 + q^2}$
  $M_2 = p - iq$                                                    $\tan\theta = q/p$
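
The three cases of Table 4.2 can be exercised with a small sketch that builds the closed-form solution from the roots of the characteristic equation and compares it with values generated directly from the recurrence. The function names, the starting values $y(0)$, $y(1)$, and the tolerance used to decide when two roots count as equal are all assumptions made for illustration. Complex arithmetic is used throughout, so the conjugate-complex case is handled by the same formula as the real, unequal case.

```python
import cmath

def solve_homogeneous(a0, a1, a2, y0, y1):
    """Closed-form y(x) for (a0 E^2 + a1 E + a2) y = 0 with y(0) = y0, y(1) = y1."""
    disc = cmath.sqrt(a1 * a1 - 4 * a0 * a2)
    m1 = (-a1 + disc) / (2 * a0)
    m2 = (-a1 - disc) / (2 * a0)
    if abs(m1 - m2) > 1e-12:            # distinct roots: y = c1 m1^x + c2 m2^x
        c2 = (y1 - m1 * y0) / (m2 - m1)
        c1 = y0 - c2
        return lambda x: (c1 * m1**x + c2 * m2**x).real
    c1 = y0                             # equal roots: y = (c1 + c2 x) m1^x
    c2 = y1 / m1 - y0
    return lambda x: ((c1 + c2 * x) * m1**x).real

def iterate(a0, a1, a2, y0, y1, n):
    """Generate y(0), ..., y(n) directly from a0 y(x+2) + a1 y(x+1) + a2 y(x) = 0."""
    ys = [y0, y1]
    while len(ys) <= n:
        ys.append(-(a1 * ys[-1] + a2 * ys[-2]) / a0)
    return ys

# One equation for each row of the table: complex, real unequal, real equal roots.
for coeffs in [(1, 2, 4), (1, -5, 6), (1, -4, 4)]:
    closed = solve_homogeneous(*coeffs, 1.0, 0.5)
    direct = iterate(*coeffs, 1.0, 0.5, 6)
    print(coeffs, [round(closed(x), 6) for x in range(7)], direct)
```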


EXAMPLE 1

Find a complete solution of the difference equation $(E^2 + 2E + 4)y = 0$.

The characteristic equation in this case is $M^2 + 2M + 4 = 0$, and its roots are
$M_1, M_2 = -1 \pm i\sqrt{3}$. Since

$$r = \sqrt{(-1)^2 + (\sqrt{3})^2} = 2 \qquad \text{and} \qquad \theta = \tan^{-1}\frac{\sqrt{3}}{-1} = \frac{2\pi}{3}$$

we have as a complete solution

$$y = 2^x\left(A\cos\frac{2\pi x}{3} + B\sin\frac{2\pi x}{3}\right)$$


To solve the nonhomogeneous equation

(12)  $(a_0E^2 + a_1E + a_2)y = \phi(x)$

we must, according to Theorem 3, add a particular solution of
(12) to a complete solution of the related homogeneous equation (5).
To find the necessary particular solution $Y$, we use the
method of undetermined coefficients, starting with an arbitrary
linear combination of all the independent terms which arise from
$\phi(x)$ by repeatedly applying the operator $E$. As in the case of
differential equations, if any term in the initial choice for $Y$
duplicates a term in the complementary function, it and all
associated terms must be multiplied by $x$ until duplication is
eliminated. The procedure is summarized in Table 4.3.




TABLE 4.3

Difference equation:  $(a_0E^2 + a_1E + a_2)y = \phi(x)$

$\phi(x)$*                                Necessary choice for particular solution $Y$†
--------------------------------------   ------------------------------------------------------------
1. $\alpha$ (constant)                    $A$

2. $\alpha x^k$                           $A_0x^k + A_1x^{k-1} + \cdots + A_{k-1}x + A_k$
   ($k$ a positive integer)

3. $\alpha k^x$                           $Ak^x$

4. $\alpha\cos kx$                        $A\cos kx + B\sin kx$
5. $\alpha\sin kx$

6. $\alpha x^k l^x \cos mx$               $(A_0x^k + \cdots + A_{k-1}x + A_k)l^x\cos mx$
7. $\alpha x^k l^x \sin mx$                 $+ (B_0x^k + \cdots + B_{k-1}x + B_k)l^x\sin mx$

* When $\phi(x)$ consists of a sum of several terms, the appropriate choice for $Y$
is the sum of the $Y$ expressions corresponding to these terms individually.
† Whenever a term in any of the $Y$'s listed in this column duplicates a term
already in the complementary function, all terms in that $Y$ must be multiplied
by the lowest positive integral power of $x$ sufficient to eliminate the
duplication.

EXAMPLE 2

Find a complete solution of the difference equation $(E^2 - 5E + 6)y = x + 2^x$.

The characteristic equation in this case is $M^2 - 5M + 6 = 0$, and from its roots $M_1 = 2$,
$M_2 = 3$, we can immediately construct the complementary function $y = c_12^x + c_23^x$. For a
particular solution we would ordinarily try $Y = Ax + B + C2^x$. However, it is clear that $C2^x$
duplicates a term in the complementary function. Hence, we must multiply $C2^x$ by $x$ before
incorporating it in our choice for $Y$. Thus we substitute $Y = Ax + B + Cx2^x$ into the difference
equation, getting

$$[A(x + 2) + B + C(x + 2)2^{x+2}] - 5[A(x + 1) + B + C(x + 1)2^{x+1}] + 6[Ax + B + Cx2^x] = x + 2^x$$

or

$$2Ax + (-3A + 2B) - 2C2^x = x + 2^x$$

which will be an identity if and only if $A = \frac{1}{2}$, $B = \frac{3}{4}$, and $C = -\frac{1}{2}$. A complete solution is,
therefore,

$$y = c_12^x + c_23^x + \frac{x}{2} + \frac{3}{4} - \frac{x2^x}{2}$$
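
The complete solution of Example 2 can be confirmed by direct substitution. The sketch below is only a check, with hypothetical function names; it evaluates $y(x+2) - 5y(x+1) + 6y(x) - (x + 2^x)$ in exact rational arithmetic for nonnegative integer $x$ and arbitrary test values of $c_1$ and $c_2$.

```python
from fractions import Fraction as F

def y(x, c1, c2):
    """Complete solution of Example 2 for a nonnegative integer x."""
    return c1 * 2**x + c2 * 3**x + F(x, 2) + F(3, 4) - F(x * 2**x, 2)

def residual(x, c1, c2):
    """Left member minus right member of (E^2 - 5E + 6) y = x + 2^x."""
    lhs = y(x + 2, c1, c2) - 5 * y(x + 1, c1, c2) + 6 * y(x, c1, c2)
    return lhs - (x + 2**x)

# The residual should vanish identically, whatever the constants are.
print([residual(x, F(3), F(-2)) for x in range(6)])
```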


EXAMPLE 3

Find the sum of the series

$$s_n = \sum_{j=1}^{n} jk^j \qquad k \ne 1$$

Clearly, $s$ satisfies the first-order difference equation

$$s_{n+1} - s_n \equiv (E - 1)s_n = (n + 1)k^{n+1}$$

The characteristic equation here is $M - 1 = 0$, and so the complementary function is simply
$s = c_1(1)^n = c_1$. To find a particular integral, we assume

$$S = (an + b)k^{n+1}$$

Then, substituting, we must have

$$[a(n + 1) + b]k^{n+2} - (an + b)k^{n+1} = (n + 1)k^{n+1}$$

or, dividing out $k^{n+1}$ and collecting terms,

$$n(ak - a) + (ak + bk - b) = n + 1$$

This will be an identity in the variable $n$ if and only if

$$a(k - 1) = 1 \qquad \text{and} \qquad ak + b(k - 1) = 1$$

Hence,

$$a = \frac{1}{k - 1} \qquad \text{and} \qquad b = -\frac{1}{(k - 1)^2}$$

and a complete solution is

$$s = c_1 + \frac{[n(k - 1) - 1]k^{n+1}}{(k - 1)^2}$$

To determine $c_1$ we use the obvious fact that when $n = 1$, $s = k$. Thus we must have

$$k = c_1 + \frac{(k - 2)k^2}{(k - 1)^2} \qquad \text{or} \qquad c_1 = \frac{k}{(k - 1)^2}$$

Hence, finally,

$$s = \frac{k + [n(k - 1) - 1]k^{n+1}}{(k - 1)^2}$$
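
The closed form just obtained is easy to test against a direct summation. In this sketch the function names are hypothetical and the series is taken to be $\sum_{j=1}^{n} jk^j$, as the difference equation $s_{n+1} - s_n = (n + 1)k^{n+1}$ with $s_1 = k$ implies.

```python
from fractions import Fraction as F

def series_sum(n, k):
    """Direct summation of 1*k + 2*k^2 + ... + n*k^n."""
    return sum(j * k**j for j in range(1, n + 1))

def closed_form(n, k):
    """s = (k + [n(k - 1) - 1] k^(n+1)) / (k - 1)^2, valid for k != 1."""
    k = F(k)
    return (k + (n * (k - 1) - 1) * k**(n + 1)) / (k - 1)**2

for k in (2, 3, -2, F(1, 2)):
    print(k, [closed_form(n, k) == series_sum(n, k) for n in range(1, 6)])
```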


EXAMPLE 4

In the system shown in Fig. 4.2a the point $P_0$ is kept at the constant potential $V_0$ with respect
to the ground. What is the potential at each of the points $P_1, P_2, \ldots, P_{n-1}$?

According to Kirchhoff's first law, the sum of the currents flowing toward any junction in a
network must equal the sum of the currents flowing away from that junction. Hence, at a general
point $P_{x+1}$ (Fig. 4.2b) we have

$$i_x = i_{x+1} + I_{x+1}$$

or, replacing each current by its equivalent according to Ohm's law,

$$\frac{V_x - V_{x+1}}{r} = \frac{V_{x+1} - V_{x+2}}{r} + \frac{V_{x+1}}{2r}$$

or, finally,

(13)  $V_{x+2} - \frac{5}{2}V_{x+1} + V_x = 0$

This equation holds at the junctions $P_2, \ldots, P_{n-2}$, that is, at all but the points $P_1$ and $P_{n-1}$, where
we have the respective conditions

(14)  $-V_2 + \frac{5}{2}V_1 = V_0$   since $V_0$ is given
(15)  $-\frac{5}{2}V_{n-1} + V_{n-2} = 0$   since $V_n = 0$

Equations (13), (14), and (15) constitute a system of $n - 1$ linear equations from which
the unknown potentials $V_1, V_2, \ldots, V_{n-1}$ can be found by completely elementary though very
tedious steps for any particular value of $n$. However, it is much simpler and more elegant to
regard Eq. (13) as a second-order difference equation, subject to the end conditions (14) and
(15), which will serve to determine the values of the arbitrary constants appearing in any
complete solution of (13).

Taking this point of view, we first set up the characteristic equation of Eq. (13):

$$M^2 - \frac{5}{2}M + 1 = 0$$

From its roots $M_1 = \frac{1}{2}$ and $M_2 = 2$, we then construct a complete solution of Eq. (13), namely,

$$V_x = A\left(\tfrac{1}{2}\right)^x + B2^x$$

Substituting this into Eqs. (14) and (15), we have

$$-\left(\frac{A}{4} + 4B\right) + \frac{5}{2}\left(\frac{A}{2} + 2B\right) = V_0$$

$$-\frac{5}{2}\left(\frac{A}{2^{n-1}} + B2^{n-1}\right) + \left(\frac{A}{2^{n-2}} + B2^{n-2}\right) = 0$$

from which we find at once

$$A = \frac{2^{2n}V_0}{2^{2n} - 1} \qquad B = -\frac{V_0}{2^{2n} - 1}$$

The final solution is, therefore,

$$V_x = \left(\frac{2^{2n}}{2^x} - 2^x\right)\frac{V_0}{2^{2n} - 1}$$

That this reduces to $V_0$ when $x = 0$ and reduces to 0 when $x = n$
is easily verified.
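
The "elementary though very tedious" route and the difference-equation route of Example 4 can be compared numerically. The sketch below, with hypothetical function names, solves the $n - 1$ linear equations (13), (14), (15) by straightforward elimination on the tridiagonal system and compares the result with the closed form $V_x = V_0(2^{2n-x} - 2^x)/(2^{2n} - 1)$. The values $n = 6$ and $V_0 = 10$ are arbitrary test data.

```python
from fractions import Fraction as F

def ladder_closed_form(n, v0):
    """V_x for x = 0..n from the solution of the difference equation."""
    return [F(v0) * (F(2)**(2 * n - x) - F(2)**x) / (F(2)**(2 * n) - 1)
            for x in range(n + 1)]

def ladder_by_elimination(n, v0):
    """Solve -V[j-1] + (5/2)V[j] - V[j+1] = 0, j = 1..n-1, with V[0] = v0, V[n] = 0."""
    size = n - 1                       # unknowns V_1 .. V_{n-1}; requires n >= 2
    diag = [F(5, 2)] * size            # main diagonal; both off-diagonals are -1
    rhs = [F(0)] * size
    rhs[0] = F(v0)                     # Eq. (14): (5/2)V_1 - V_2 = V_0
    for j in range(1, size):           # forward elimination
        w = F(-1) / diag[j - 1]
        diag[j] = diag[j] - w * F(-1)
        rhs[j] = rhs[j] - w * rhs[j - 1]
    v = [F(0)] * size
    v[-1] = rhs[-1] / diag[-1]
    for j in range(size - 2, -1, -1):  # back substitution
        v[j] = (rhs[j] + v[j + 1]) / diag[j]
    return [F(v0)] + v + [F(0)]

print(ladder_by_elimination(6, 10) == ladder_closed_form(6, 10))
```

The elimination must be redone for every new $n$, whereas the closed form gives every $V_x$ at once, which is the advantage the example points out.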



FIGURE 4.2

A ladder-type network with identical loops. (Although the network shown in Fig. 4.2a
appears to contain exactly seven loops, the number of loops is actually indefinite. This is
implied by the fact that the central portion of the figure is drawn with lighter lines; this
convention will be used throughout the book to suggest a configuration of indefinite extent.)


EXERCISES 

1 Find a complete solution of each of the following equations:

a $(E^2 + 7E + 12)y = 0$          b $(E^2 + 6E + 9)y = 0$
c $(E^2 + 2E + 2)y = 0$           d $(\Delta^2 - 3\Delta + 2)y = 0$


2 Find a complete solution of each of the following equations: 

“ g'-fi-e)!!-*’ b (iE’-tt + Dj-z + z+a- 

i+L"-” 8 * 4 (A ! + 64 + IS), - 2- 

e $(E^2 - 3E + 2)y = 2^x + 2^{-x}$          f $(E^2 - 4E + 4)y = 2^x$

3 Find a complete solution of each of the following equations:

a $(E^2 - E - 6)y = x + 3^x$          b $(E^2 + 1)y = \sin x$

4 Find a complete solution of each of the following equations:

a $(E^3 - 6E^2 + 11E - 6)y = 0$          b $(E^4 - 16)y = x + 3^x$
c $(E^4 + 10E^2 + 9)y = 0$               d $(E^4 + 5E^2 - 6)y = 5$

5 Show that the difference equation $(E^2 - 2\lambda E + 1)y = 0$ has the indicated solution in each
of the following special cases:

$\lambda < -1$          $y = A(-1)^x\cosh\mu x + B(-1)^x\sinh\mu x$     $\cosh\mu = -\lambda$
$\lambda = -1$          $y = A(-1)^x + Bx(-1)^x$
$-1 < \lambda < 1$      $y = A\cos\mu x + B\sin\mu x$                   $\cos\mu = \lambda$
$\lambda = 1$           $y = A + Bx$
$1 < \lambda$           $y = A\cosh\mu x + B\sinh\mu x$                 $\cosh\mu = \lambda$

6 Work Example 4 with both $P_0$ and $P_n$ maintained at the constant potential $V_0$.

7 Work Example 4 given that the common value of the resistances in the vertical branches

8 A system consists of $n$ spring-connected masses, as shown in Fig. 4.3. What is the displacement
of each mass from its original position when the system is again in equilibrium after
a force $F_0$ is applied to the right-hand end?


FIGURE 4.3

(See explanation of convention used for Fig. 4.2.)

9 Show that the $n$th-order determinant

$$D_n = \begin{vmatrix} \lambda & 1 & 0 & \cdots & 0 & 0 \\ 1 & \lambda & 1 & \cdots & 0 & 0 \\ 0 & 1 & \lambda & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & 0 & \cdots & 1 & \lambda \end{vmatrix}$$

satisfies the difference equation $(E^2 - \lambda E + 1)D = 0$. Hence show that, when $\lambda > 2$,

$$D_n = \frac{\sinh (n + 1)\mu}{\sinh\mu} \qquad \text{where } \cosh\mu = \frac{\lambda}{2}$$

What is $D_n$ if $\lambda = 2$? $-2 < \lambda < 2$? $\lambda = -2$? $\lambda < -2$?

10 If $y_1(x)$ and $y_2(x)$ are any two solutions of the general linear second-order difference equation
$[a_0(x)E^2 + a_1(x)E + a_2(x)]y = 0$, show that Casorati's determinant $C[y_1(x),y_2(x)]$ satisfies
the relation $[a_0(x)E - a_2(x)]C = 0$. [Hint: Write down the conditions that both $y_1(x)$ and
$y_2(x)$ satisfy the given equation; then eliminate the terms in $Ey_1(x)$ and $Ey_2(x)$.]

11 Prove Theorem 1. 

12 Prove Theorem 2 in the special case where the coefficients are constants. (Hint: Recall 
the proof of Theorem 2, Sec. 2.1, and use the result of Exercise 10.) 

13 Prove Theorem 3. 

14 Show that the integral $I_n = \int_0^{\pi} \frac{\cos nt - \cos n\lambda}{\cos t - \cos\lambda}\,dt$ satisfies the equation

$$[E^2 - (2\cos\lambda)E + 1]I = 0$$

Solve this equation, and find an explicit expression for $I_n$.

15 Discuss the solution of each of the following equations:

a $Ey = 0$                          b $Ey = \phi(x)$
c $(a_0E^2 + a_1E)y = 0$            d $(a_0E^2 + a_1E)y = \phi(x)$

4.6

The method of least squares

The problem of curve fitting admits of two somewhat different 
interpretations. In the first place, we may ask for the equation of 
a curve of prescribed type which passes exactly through each 
point of a given set. For polynomial curves this is most easily 
accomplished by means of interpolation formulas such as we 
developed in Sec. 4.2. On the other hand, we may weaken these 
requirements and ask for some simpler curve whose equation 
contains too few parameters to permit it to pass exactly through 
each given point but which comes “as close as possible” to each 
point. For instance, given a set of points as in Fig. 4.4a, a straight
line passing as close as possible to each point may very well be
more useful than some complicated curve passing exactly through
each point. This will certainly be the case with experimental data
which theoretically should fall along a straight line but which fail
to do so because of errors of observation. The necessary measure
of "as close as possible" is almost universally taken to be the
least-square criterion,* and the process of applying this criterion
is known as the method of least squares, which we shall now
develop.

FIGURE 4.4

The approximate fitting of a straight line to a set of points.

Let us begin by supposing that we wish to fit a straight line $l$
whose equation is

(1)  $y = a + bx$

to the $n$ points $(x_1,y_1), (x_2,y_2), \ldots, (x_n,y_n)$. Since two points
completely determine a straight line, it will in general be impossible
for the required line to pass through more than two of the
given points, and it may not pass through any. Hence, the
coordinates of the general point $(x_i,y_i)$ will not satisfy Eq. (1).

That is, when we substitute $x_i$ into Eq. (1), we get, not $y_i$, but
rather the ordinate of $l$, which, as we see in Fig. 4.4b, differs from
$y_i$ by $\delta_i$. In other words,

(2)  $y_i - (a + bx_i) = \delta_i \ne 0$

If we compute the discrepancy $\delta_i$ for each point of the set
and form the sum of the squares of these quantities (in order to
prevent large positive and large negative $\delta$'s canceling each other
and thereby giving an unwarranted impression of accuracy), we
obtain

(3)  $E = \sum_{i=1}^{n}\delta_i^2 = (y_1 - a - bx_1)^2 + (y_2 - a - bx_2)^2 + \cdots + (y_n - a - bx_n)^2$

The quantity $E$ is obviously a measure of how well the line $l$ fits
the set of points as a whole. For $E$ will be zero if and only if each
of the points lies on $l$, and the larger $E$ is, the farther the points
are, on the average, from $l$. The least-square criterion is now
simply this: that the parameters $a$ and $b$ should be chosen so as to
make the sum of the squares of the deviations $E$ as small as possible.

To do this, we apply the usual conditions for minimizing a
function of several variables and equate to zero the two first
partial derivatives, $\partial E/\partial a$ and $\partial E/\partial b$. This gives us the two equations

$$\frac{\partial E}{\partial a} = 2(y_1 - a - bx_1)(-1) + 2(y_2 - a - bx_2)(-1) + \cdots + 2(y_n - a - bx_n)(-1) = 0$$

$$\frac{\partial E}{\partial b} = 2(y_1 - a - bx_1)(-x_1) + 2(y_2 - a - bx_2)(-x_2) + \cdots + 2(y_n - a - bx_n)(-x_n) = 0$$

* A brief discussion of the reasons for this will be found in A. M. Mood, 
“Introduction to the Theory of Statistics,” p. 311, McGraw-Hill Book 
Company, New York, 1950. 




or, dividing by 2 and collecting terms on the unknown coefficients
$a$ and $b$,

(4)  $na + b\sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$

(5)  $a\sum_{i=1}^{n} x_i + b\sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_iy_i$

Equations (4) and (5) are two simultaneous linear equations
whose solution for $a$ and $b$ presents no difficulty.
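
Equations (4) and (5) can be solved once and for all by determinants, which leads to the short sketch below. The function name `fit_line` and the sample points are hypothetical; the code simply forms the four sums appearing in (4) and (5) and applies Cramer's rule.

```python
def fit_line(xs, ys):
    """Least-square line y = a + b x: solve Eqs. (4) and (5) by Cramer's rule."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx            # zero only if all the x's coincide
    a = (sy * sxx - sx * sxy) / det
    b = (n * sxy - sx * sy) / det
    return a, b

# Points lying exactly on y = 1 + 2x are reproduced exactly.
print(fit_line([0, 1, 2, 3], [1, 3, 5, 7]))

# Three points that do not lie on any one line.
print(fit_line([0, 1, 2], [0, 1, 1]))
```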

For $i = 1, 2, \ldots, n$, (2) defines a system of $n$ equations in
the two unknowns $a$ and $b$ which should, ideally, be satisfied,
but which actually are not. Moreover, minimizing $E$ is nothing
more than minimizing the sum of the squares of the amounts by
which these $n$ equations fail to be satisfied.

The preceding observation suggests a somewhat more general
point of view, namely, that the method of least squares is simply
a process for finding the best possible values for a set of $m$ unknowns,
say $x_1, x_2, \ldots, x_m$, connected by $n$ linear equations

$$a_{11}x_1 + a_{12}x_2 + \cdots + a_{1m}x_m = b_1$$
$$a_{21}x_1 + a_{22}x_2 + \cdots + a_{2m}x_m = b_2$$
$$\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots$$
$$a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nm}x_m = b_n$$

when $n > m$. Since the number of equations exceeds the number
of unknowns, the system presumably does not admit of an exact
solution; i.e., there is no set of values for $x_1, x_2, \ldots, x_m$ for
which each equation is exactly satisfied. Hence, we consider the
discrepancies

$$\delta_i = a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{im}x_m - b_i \ne 0 \qquad i = 1, 2, \ldots, n$$

and attempt to find values for $x_1, x_2, \ldots, x_m$ which will make

$$E = \sum_{i=1}^{n}\delta_i^2 = \sum_{i=1}^{n}(a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{im}x_m - b_i)^2$$

as small as possible.

To minimize E we must equate to zero each of its first partial 
derivatives 

dE dE dE_ 

dXi dx 2 ' dx m 

dE 

For ^ this gives the equation 

dE V rt/ 

dXl ~~ l a iZ X ‘l 4 ~ ' ' ' 4 " di m X m ~ bi) (dn) = 0 

or 

*i ^ diidn + *2 ^ aiian 4- • • • 4- *m ^ diicum — ^ anbi 


THE METHOD OF LEAST SQUARES 


and similarly, for the other partial derivatives, 

Xi Jj? a i2 an x<l ^ anda 4- • ■ • + x m ^ a^dim = ^ a^bi 

X 1 ^ a lm a ix -j- X 2 dirndl 2 ~ t - * 4" Xm ^ dim&im — ^ dimb{ 

We have thus obtained a system of m linear equations in the m
unknowns $x_1, x_2, \ldots, x_m$, whose solution is now a routine
matter. As a practical detail, it is worthy of note that these
minimizing conditions, or normal equations, as they are usually
called, can be written down at once according to the following
rule:

RULE 1 If each of n linear equations in the m unknowns
$x_1, x_2, \ldots, x_m$ (n > m)
is multiplied by the coefficient of $x_i$ in that equation, the sum of
the resulting equations is the ith normal equation in the
least-square solution of the system.
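Rule 1 translates almost word for word into code. The following is a minimal sketch, not a library routine: the function name `normal_equations` and the list-of-rows layout are assumptions made for illustration, and the sample equations are the five conditions a + bx + cx² = y treated in Example 1.

```python
def normal_equations(rows, rhs):
    """Build the normal equations by Rule 1.

    rows[i][j] is the coefficient of unknown j in equation i,
    rhs[i] is the right-hand side b_i of equation i.
    """
    m = len(rows[0])
    lhs = [[0.0] * m for _ in range(m)]
    vec = [0.0] * m
    for coeffs, b in zip(rows, rhs):
        for j in range(m):
            # Multiply the whole equation by its own coefficient of x_j
            # and add it into the j-th normal equation.
            for k in range(m):
                lhs[j][k] += coeffs[j] * coeffs[k]
            vec[j] += coeffs[j] * b
    return lhs, vec


# Unknowns (a, b, c); each condition reads a + b*x + c*x**2 = y.
xs = [-3, -2, 0, 3, 4]
ys = [18, 10, 2, 2, 5]
rows = [[1, x, x * x] for x in xs]
lhs, vec = normal_equations(rows, ys)
for row, value in zip(lhs, vec):
    print(row, "=", value)
```

The coefficient matrix produced this way is always symmetric, which is a convenient check on hand computation.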

EXAMPLE 1 

By the method of least squares, fit a parabolic equation $y = a + bx + cx^2$ to the data:

x    -3    -2     0     3     4
y    18    10     2     2     5

Substituting these pairs of values into the equation $y = a + bx + cx^2$, we find that a, b,
and c should satisfy the conditions

a - 3b +  9c = 18
a - 2b +  4c = 10
a            =  2
a + 3b +  9c =  2
a + 4b + 16c =  5

In general, three unknowns cannot be made to satisfy more than three conditions; hence, the
most we can do is to determine values of a, b, and c which will satisfy these equations as nearly
as possible.

To set up the first of the three normal equations required by the method of least squares, 
we must multiply each of the equations of condition by the coefficient of a in that equation 
and add, getting in this case simply the sum of the five equations: 

5a + 2b + 38c = 37

To set up the second normal equation, we multiply each equation by the coefficient of b in that
equation and add, getting

-3a +  9b - 27c = -54
-2a +  4b -  8c = -20
  0 +   0 +   0 =   0
 3a +  9b + 27c =   6
 4a + 16b + 64c =  20
----------------------
 2a + 38b + 56c = -48




In the same way, multiplying each equation by the coefficient of c in that equation, we get the 
third normal equation: 

 9a - 27b +  81c = 162
 4a -  8b +  16c =  40
  0 +   0 +    0 =   0
 9a + 27b +  81c =  18
16a + 64b + 256c =  80
-----------------------
38a + 56b + 434c = 300

The solution of the three normal equations is a simple matter, and we find

a = 1.82    b = -2.65    c = 0.87

The required solution is therefore

$y = 1.82 - 2.65x + 0.87x^2$
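The "simple matter" of solving the three normal equations can be sketched as follows. This is an illustrative Gaussian elimination with partial pivoting written for this example (the helper name `solve_linear` is an assumption); any standard linear solver would serve equally well.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination
    with partial pivoting and back substitution."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Bring the largest remaining entry of this column to the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


# The three normal equations of Example 1.
normal_lhs = [[5, 2, 38], [2, 38, 56], [38, 56, 434]]
normal_rhs = [37, -48, 300]
a, b, c = solve_linear(normal_lhs, normal_rhs)
print(round(a, 2), round(b, 2), round(c, 2))
```

To two decimals the result should agree with the values of a, b, and c quoted in the example.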

When, as is often the case, the abscissas of the points to 
which we wish to fit a polynomial curve are equally spaced, the 
labor involved in the least-square procedure we have just de- 
scribed can be significantly reduced by using what are known as 

orthogonal polynomials. 


DEFINITION 1 

If n + 1 polynomials $P_{nm}(x)$ of respective degrees m = 0, 1, 2, . . . , n have the
property that

(6)  $\sum_{x=0}^{n} P_{nj}(x)P_{nk}(x) = 0 \qquad j \neq k$

they are called orthogonal polynomials.


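By methods which need not concern us here,* it has been shown
that, for each n, there exists a set of n + 1 orthogonal polynomials,
and the general formula for them has been obtained:

(7)  $P_{nm}(x) = \sum_{k=0}^{m} (-1)^k \binom{m}{k}\binom{m+k}{k} \frac{x^{(k)}}{n^{(k)}}$  where  $x^{(k)} = x(x-1)\cdots(x-k+1)$

In particular,

$P_{n0}(x) = 1$

$P_{n1}(x) = 1 - 2\,\frac{x}{n}$

$P_{n2}(x) = 1 - 6\,\frac{x}{n} + 6\,\frac{x(x-1)}{n(n-1)}$

$P_{n3}(x) = 1 - 12\,\frac{x}{n} + 30\,\frac{x(x-1)}{n(n-1)} - 20\,\frac{x(x-1)(x-2)}{n(n-1)(n-2)}$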

Clearly, for each m ≤ n, any polynomial of degree m can be
expressed as a linear combination of the polynomials
$P_{n0}(x), P_{n1}(x), \ldots, P_{nm}(x)$


* See, for instance, W. E. Milne, "Numerical Analysis," pp. 265-275 and
375-381, Princeton University Press, Princeton, N.J., 1949.





for the expression

(8)  $P(x) = a_0P_{n0}(x) + a_1P_{n1}(x) + \cdots + a_mP_{nm}(x)$

is obviously a polynomial of degree m containing the maximum
number, m + 1, of independent, arbitrary constants which can
appear in the general polynomial of this degree. Moreover, the
coefficients $a_0, a_1, \ldots, a_m$ in (8) can easily be found. For if we
multiply both sides of this identity by $P_{ni}(x)$, say, and then sum
from x = 0 to x = n, we get

$\sum_{x=0}^{n} P(x)P_{ni}(x) = a_0 \sum_{x=0}^{n} P_{n0}(x)P_{ni}(x) + \cdots + a_i \sum_{x=0}^{n} P_{ni}^2(x) + \cdots + a_m \sum_{x=0}^{n} P_{nm}(x)P_{ni}(x)$

But, from the so-called orthogonality property of the polynomials
$\{P_{nm}(x)\}$, which is expressed by Eq. (6), it follows that every
term on the right-hand side of the last expression is zero except
the sum

$a_i \sum_{x=0}^{n} P_{ni}^2(x)$

Hence, solving for $a_i$, we obtain the formula

(9)  $a_i = \frac{\sum_{x=0}^{n} P(x)P_{ni}(x)}{\sum_{x=0}^{n} P_{ni}^2(x)} \qquad i = 0, 1, \ldots, m$

The property described by (6) is a very important one and 
we shall encounter it again in Sec. 11.2 when we attempt, in a 
manner analogous to the expansion (8), to express an arbitrary 
vector as a linear combination of certain given, independent 
vectors. Also, in Chaps. 6, 8, and 9 we shall study expansion prob- 
lems resembling (8), in which the coefficients will be determined 
through the use of orthogonality properties involving integrals 
rather than sums, as in (6). 

Clearly, an expansion of the form (8) can be created for any 
function f(x), polynomial or not, merely by using the coefficient 
formula (9) with f(x) replacing P(x). Such expansions are of great 
importance, for, although it is obvious that they cannot represent 
f(x) exactly unless f(x) is a polynomial of degree n or less, they 
provide the best polynomial approximations to f(x) in the
least-square sense. To prove this, suppose that we have a function
f(x) defined for the n + 1 equally spaced values x = 0, 1, . . . , n
which we wish to approximate with a polynomial of degree
m (< n). If we assume the polynomial to be written in the form
(8), the discrepancy at the general point x is

$f(x) - a_0P_{n0}(x) - \cdots - a_iP_{ni}(x) - \cdots - a_mP_{nm}(x)$




and the principle of least squares requires that we minimize the 
sum 


(10)  $E = \sum_{x=0}^{n} [f(x) - a_0P_{n0}(x) - \cdots - a_iP_{ni}(x) - \cdots - a_mP_{nm}(x)]^2$

If we equate to zero the derivative of E with respect to $a_i$, say, we
obtain the general minimizing condition

$\frac{\partial E}{\partial a_i} = -\sum_{x=0}^{n} 2[f(x) - a_0P_{n0}(x) - \cdots - a_iP_{ni}(x) - \cdots - a_mP_{nm}(x)]P_{ni}(x) = 0$

or, breaking up the sum,

(11)  $\sum_{x=0}^{n} f(x)P_{ni}(x) - a_0 \sum_{x=0}^{n} P_{n0}(x)P_{ni}(x) - \cdots - a_i \sum_{x=0}^{n} P_{ni}^2(x) - \cdots - a_m \sum_{x=0}^{n} P_{nm}(x)P_{ni}(x) = 0$

But, from the orthogonality of the P's, the sums involving two
different P's are all zero, and Eq. (11) reduces to

$\sum_{x=0}^{n} f(x)P_{ni}(x) - a_i \sum_{x=0}^{n} P_{ni}^2(x) = 0$

or

(12)  $a_i = \frac{\sum_{x=0}^{n} f(x)P_{ni}(x)}{\sum_{x=0}^{n} P_{ni}^2(x)} \qquad i = 0, 1, \ldots, m$
which is exactly the same as (9) with P(x ) replaced by f(x). 

The advantage of using orthogonal polynomials is now clear. 
In the first place, through their use the coefficients in the least- 
square polynomial approximation to a function f(x) defined for 
the n + 1 equally spaced values x = 0, 1, . . . , n can be found
one at a time without the necessity of solving any simultaneous
equations. In the second place, since Formula (12) for $a_i$ does not
involve m, the degree of the polynomial we are fitting to the data, 
it follows that, if we desire to increase m, that is, add another 
term to the approximating polynomial, all previously calculated 
coefficients remain unchanged and only the coefficient of the new 
term need be computed. 

The sum appearing in the denominator of (12) need not be 
calculated directly because a general formula for it is available, 
namely, 


(13)  $\sum_{x=0}^{n} P_{ni}^2(x) = \frac{(n+i+1)^{(i+1)}}{(2i+1)\,(n)^{(i)}}$

† See, for instance, Milne, loc. cit.




In particular,

$\sum_{x=0}^{n} P_{n0}^2(x) = n + 1$

$\sum_{x=0}^{n} P_{n1}^2(x) = \frac{(n+1)(n+2)}{3n}$

$\sum_{x=0}^{n} P_{n2}^2(x) = \frac{(n+1)(n+2)(n+3)}{5n(n-1)}$

$\sum_{x=0}^{n} P_{n3}^2(x) = \frac{(n+1)(n+2)(n+3)(n+4)}{7n(n-1)(n-2)}$
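The orthogonality property (6) and the sums of squares listed above can be checked exactly with rational arithmetic. The sketch below assumes that the general formula has the standard form $P_{nm}(x) = \sum_k (-1)^k \binom{m}{k}\binom{m+k}{k}\, x^{(k)}/n^{(k)}$, with $x^{(k)} = x(x-1)\cdots(x-k+1)$, which reproduces the special cases $P_{n0}, \ldots, P_{n3}$ given in the text; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def falling(x, k):
    """Factorial power x(x-1)...(x-k+1); equal to 1 when k = 0."""
    out = 1
    for j in range(k):
        out *= x - j
    return out


def P(n, m, x):
    """Exact value of P_nm(x) at an integer x, from the assumed general formula."""
    return sum(
        Fraction((-1) ** k * comb(m, k) * comb(m + k, k) * falling(x, k),
                 falling(n, k))
        for k in range(m + 1)
    )


def inner(n, j, k):
    """Sum of P_nj(x) * P_nk(x) over x = 0, 1, ..., n."""
    return sum(P(n, j, x) * P(n, k, x) for x in range(n + 1))


def norm_formula(n, i):
    """Closed form for the sum of P_ni(x)**2 over x = 0, 1, ..., n."""
    return Fraction(falling(n + i + 1, i + 1), (2 * i + 1) * falling(n, i))


n = 4
for j in range(4):
    print([str(inner(n, j, k)) for k in range(4)])
```

For n = 4 the printed array is diagonal, and its diagonal entries are the sums of squares given by the closed formula.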

To determine the accuracy with which the polynomial
approximation fits the data it is not necessary to compute E from
(10), since it can be shown that, in general,

(14)  $E = \sum_{x=0}^{n} f^2(x) - \sum_{i=0}^{m} \left[ a_i^2 \sum_{x=0}^{n} P_{ni}^2(x) \right]$

EXAMPLE 2

Using orthogonal polynomials, fit equations of the form $y = a_0 + a_1t$ and $y = a_0 + a_1t + a_2t^2$ to
the data:


t    0.00    0.25    0.50    0.75    1.00
y    0.00    0.06    0.20    0.60    0.90


As a first step we must introduce an auxiliary variable x = 4t which will take on the values
0, 1, 2, 3, 4 when t takes on the given values 0.00, 0.25, 0.50, 0.75, 1.00. Then, because there are
five given points to which the required curves are to be fitted, we observe that n + 1 = 5 or
n = 4. Next, lacking tables of the orthogonal polynomials, we must compute the values of
$P_{40}(x)$, $P_{41}(x)$, and $P_{42}(x)$ for the five values x = 0, 1, 2, 3, 4. This is a simple matter, of course,
and the values shown in the accompanying table can be calculated at once. It is then necessary
to compute the sums of the products of the respective values of the y's and each of the P's.
These products are shown in the last three columns of the table.


t       x    y       P_40     P_41     P_42     yP_40    yP_41    yP_42
0.00    0    0.00    1.000    1.000    1.000    0.000    0.000    0.000
0.25    1    0.06    1.000    0.500   -0.500    0.060    0.030   -0.030
0.50    2    0.20    1.000    0.000   -1.000    0.200    0.000   -0.200
0.75    3    0.60    1.000   -0.500   -0.500    0.600   -0.300   -0.300
1.00    4    0.90    1.000   -1.000    1.000    0.900   -0.900    0.900

$\sum_{x=0}^{4} P_{40}^2 = 5.000 \qquad \sum_{x=0}^{4} yP_{40} = 1.760$

$\sum_{x=0}^{4} P_{41}^2 = 2.500 \qquad \sum_{x=0}^{4} yP_{41} = -1.170$

$\sum_{x=0}^{4} P_{42}^2 = 3.500 \qquad \sum_{x=0}^{4} yP_{42} = 0.370$




The coefficients $a_0$, $a_1$, and $a_2$ are then given by Eq. (12):

$a_0 = \frac{1.760}{5.000} = 0.3520 \qquad a_1 = \frac{-1.170}{2.500} = -0.4680 \qquad a_2 = \frac{0.370}{3.500} = 0.1057$

In terms of x, the line of best fit is, therefore,

$y = a_0P_{40}(x) + a_1P_{41}(x) = 0.3520 - 0.4680\left(1 - \frac{x}{2}\right) = -0.116 + 0.234x$

and the parabola of best fit is

$y = a_0P_{40}(x) + a_1P_{41}(x) + a_2P_{42}(x)$

$\;\;\; = 0.3520 - 0.4680\left(1 - \frac{x}{2}\right) + 0.1057\left(1 - \frac{3x}{2} + \frac{x^2 - x}{2}\right)$

$\;\;\; = -0.0103 + 0.0226x + 0.0529x^2$

Then, by setting x = 4t, we obtain the curves of best fit for the data as originally given:

$y = -0.116 + 0.936t$ and $y = -0.0103 + 0.0904t + 0.8464t^2$

Using Eq. (14) we find that the sums of the squares of the departures of the points from the
line and from the parabola of best fit are, respectively, $E_1 = 0.0465$ and $E_2 = 0.0074$. From
the relative size of $E_1$ and $E_2$ we conclude that the parabola fits the data significantly better
than the straight line does.
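The arithmetic of Example 2 can be retraced in a few lines. The sketch below hard-codes the three polynomials for n = 4 (the helper name `p4` is an assumption for illustration), applies the coefficient formula (12) one coefficient at a time, and evaluates E both from the shortcut formula (sum of y² minus the sum of $a_i^2 \sum P_{4i}^2$) and directly from the residuals.

```python
def p4(m, x):
    """P_40, P_41, P_42 for n = 4, i.e., five equally spaced points."""
    if m == 0:
        return 1.0
    if m == 1:
        return 1.0 - x / 2.0
    return 1.0 - 1.5 * x + x * (x - 1) / 2.0


ys = [0.00, 0.06, 0.20, 0.60, 0.90]   # data of Example 2 at x = 0, 1, 2, 3, 4

coeffs, norms = [], []
for m in range(3):
    num = sum(y * p4(m, x) for x, y in enumerate(ys))
    den = sum(p4(m, x) ** 2 for x in range(5))
    coeffs.append(num / den)          # each coefficient found independently
    norms.append(den)

sum_y2 = sum(y * y for y in ys)
E_line = sum_y2 - sum(coeffs[i] ** 2 * norms[i] for i in range(2))
E_parabola = sum_y2 - sum(coeffs[i] ** 2 * norms[i] for i in range(3))

# Direct check of E for the parabola from the residuals themselves.
E_direct = sum(
    (y - sum(coeffs[m] * p4(m, x) for m in range(3))) ** 2
    for x, y in enumerate(ys)
)
print(coeffs, E_line, E_parabola, E_direct)
```

Note that adding the quadratic term leaves the first two coefficients untouched, which is the second advantage of orthogonal polynomials mentioned in the text.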


In many important applications the position of a moving 
object is observed at a series of equally spaced times, and its 
velocity and acceleration are required at these times. Clearly, 
these can be estimated by using the formulas for numerical 
differentiation we obtained in Sec. 4.3. However, since the 
interpolation polynomials from which these formulas of numerical 
differentiation were derived fit the raw data exactly, they and 
any formulas based on them are seriously influenced by even 
small errors in the data. On the other hand, a polynomial curve 
fitted to the data or a portion of the data by the method of least 
squares will be less influenced by random errors and will represent 
more nearly the underlying, presumably smooth trend of the 
data. Hence, derivatives computed from such approximating 
functions will in general be more accurate than those computed 
by differentiating interpolation polynomials. These ideas are 
illustrated geometrically in Fig. 4.5, where it is clear that the 


FIGURE 4.5 The relative smoothness of an interpolation polynomial and a
least-square approximation to a set of points.




slope of the interpolation polynomial fluctuates markedly from
point to point, whereas the slope of the least-square approximation
changes in a smooth fashion, which is almost certainly a
more reliable description of the trend the data would exhibit if
free from random errors.

In applying these considerations to the analysis of observed
positional data, it is customary to fit a polynomial curve of relatively
low degree, say a parabola, to successive sets of 5, 7, or 9
observations and then take the ordinate and the first and second
derivatives of this approximating polynomial at the central point
of the set as the corrected, or "smoothed," position, velocity, and
acceleration of the body at that instant.

To illustrate this technique, let us use the method of orthogonal
polynomials to fit a parabola to the five points

t      x    y
-2h    0    y_{-2}
-h     1    y_{-1}
0      2    y_0
h      3    y_1
2h     4    y_2

To do this, we use the coefficient formula (12) and the values of
$P_{40}(x)$, $P_{41}(x)$, and $P_{42}(x)$ tabulated in Example 2. The results are
found immediately to be

$a_0 = \frac{y_{-2} + y_{-1} + y_0 + y_1 + y_2}{5}$

$a_1 = \frac{2}{5}\left(y_{-2} + \frac{y_{-1}}{2} - \frac{y_1}{2} - y_2\right)$

$a_2 = \frac{2}{7}\left(y_{-2} - \frac{y_{-1}}{2} - y_0 - \frac{y_1}{2} + y_2\right)$

and thus the formula for the approximating polynomial is

(15)  $y = a_0P_{40}(x) + a_1P_{41}(x) + a_2P_{42}(x)$

$\;\;\; = \frac{1}{5}(y_{-2} + y_{-1} + y_0 + y_1 + y_2) + \frac{2}{5}\left(y_{-2} + \frac{y_{-1}}{2} - \frac{y_1}{2} - y_2\right)\left(1 - \frac{x}{2}\right)$

$\;\;\;\;\; + \frac{2}{7}\left(y_{-2} - \frac{y_{-1}}{2} - y_0 - \frac{y_1}{2} + y_2\right)\left(1 - \frac{3}{2}x + \frac{x^2 - x}{2}\right)$

When x = 2, we find for the smoothed mid-ordinate

(16)  $Y_0 = \frac{-3y_{-2} + 12y_{-1} + 17y_0 + 12y_1 - 3y_2}{35}$

To approximate dy/dt, we recall that x = 2 + t/h; hence

$\frac{dy}{dt} = \frac{dy}{dx}\frac{dx}{dt} = \frac{1}{h}\frac{dy}{dx}$

Therefore, differentiating (15) with respect to x and then setting
x = 2, we find for the smoothed value of the first derivative at the
central point of the set

(17)  $Y_0' = \frac{-2y_{-2} - y_{-1} + y_1 + 2y_2}{10h}$

Similarly, a second differentiation would yield

(18)  $Y_0'' = \frac{2y_{-2} - y_{-1} - 2y_0 - y_1 + 2y_2}{7h^2}$

as the smoothed value of the second derivative at the central
point of the set. However, it is better to find $Y_0''$ by applying
Formula (17) to the table of the smoothed first-derivative values,
getting

(19)  $Y_0'' = \frac{-2Y_{-2}' - Y_{-1}' + Y_1' + 2Y_2'}{10h}$

or, replacing each derivative by its expression in terms of the
appropriate y-values from (17),

(20)  $Y_0'' = \frac{4y_{-4} + 4y_{-3} + y_{-2} - 4y_{-1} - 10y_0 - 4y_1 + y_2 + 4y_3 + 4y_4}{100h^2}$

Since Eqs. (16) and (17) require a knowledge of two ordinates 
on each side of those being smoothed, it is evident that these 
equations can be used only for points after the second and before 
the (n — l)st in a table of data. Similarly, Formula (20) for the 
second derivative can be used only between the fifth and the 
(n — 4)th points, inclusive. To smooth to the ends of a table we 
must derive auxiliary formulas from Eq. (15) by evaluating it 
and its derivatives at x = 0, 1, 3, and 4 as well as at x = 2.
These results will be found among the exercises at the end of this 
section. In general, central formulas, that is, formulas in which 
the element being smoothed is as near as possible to the central 
member of the set of data appearing in the smoothing formula, 
should be used wherever possible. 
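The central five-point smoothing formulas are easy to exercise numerically. The sketch below (function names are illustrative) applies the smoothed ordinate, first-derivative, and second-derivative formulas to samples of a parabola, for which a least-square parabola must reproduce the function exactly, and then rebuilds the nine weights of the second-derivative formula (20) by applying the first-derivative weights twice.

```python
def smooth_center(y, h):
    """y = [y_-2, y_-1, y_0, y_1, y_2] at spacing h.
    Returns the smoothed ordinate, first derivative, and second
    derivative at the central point, from the fitted parabola."""
    ym2, ym1, y0, y1, y2 = y
    value = (-3 * ym2 + 12 * ym1 + 17 * y0 + 12 * y1 - 3 * y2) / 35.0
    d1 = (-2 * ym2 - ym1 + y1 + 2 * y2) / (10.0 * h)
    d2 = (2 * ym2 - ym1 - 2 * y0 - y1 + 2 * y2) / (7.0 * h * h)
    return value, d1, d2


def second_derivative_weights():
    """Apply the first-derivative weights to a table of smoothed first
    derivatives; returns weights on y_-4 ... y_4 (to be divided by 100 h^2)."""
    w = [-2, -1, 0, 1, 2]              # weights on y_-2 ... y_2, divided by 10h
    out = [0] * 9
    for i, wi in enumerate(w):         # outer formula uses Y'_{i-2}
        for j, wj in enumerate(w):     # Y'_{i-2} uses y_{(i-2)+(j-2)}
            out[i + j] += wi * wj
    return out


h = 0.5
samples = [1 + 2 * t + 3 * t * t for t in (-2 * h, -h, 0.0, h, 2 * h)]
print(smooth_center(samples, h))       # parabola 1 + 2t + 3t^2 sampled around t = 0
print(second_derivative_weights())
```

Because the test data lie exactly on a parabola, the three smoothed quantities coincide with the true value, slope, and second derivative at t = 0.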

The method of least squares is not limited in its application 
to problems in which the equations of condition are linear. Some- 
times, by a suitable transformation, the problem can be con- 
verted into one in which the parameters do enter linearly. For 
instance, to fit an equation of the important type $y = ae^{bx}$, we
can take the natural logarithm of each side, getting

$\ln y = \ln a + bx$

Then, considering x and ln y as new variables, say X and Y, and
ln a and b as new parameters, say A and B, we can regard the
problem as requiring the determination of A and B such that the
linear equation

$Y = A + BX$




gives the best possible fit to the known pairs of values of X (= x)
and Y (= ln y). Once A has been found, it is, of course, a simple
matter to find the actual parameter a, since A = ln a.
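The logarithmic transformation just described can be sketched as follows. The function name `fit_exponential` and the sample data are assumptions made for illustration; the two normal equations for the line Y = A + BX are solved in closed form.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a*exp(b*x) by least squares on the linearized
    equation ln y = ln a + b*x. Returns (a, b)."""
    n = len(xs)
    Y = [math.log(y) for y in ys]      # new variable Y = ln y
    sx, sy = sum(xs), sum(Y)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * v for x, v in zip(xs, Y))
    # Normal equations:  n*A + sx*B = sy,   sx*A + sxx*B = sxy
    det = n * sxx - sx * sx
    A = (sy * sxx - sx * sxy) / det
    B = (n * sxy - sx * sy) / det
    return math.exp(A), B              # a = exp(A), since A = ln a


# Hypothetical data lying exactly on y = 2*exp(0.5*x).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a, b = fit_exponential(xs, ys)
print(a, b)
```

With exact exponential data the parameters are recovered to rounding error; with noisy data the result minimizes the squared errors in ln y, not in y itself.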

Similarly, the fitting of a function y = kx n can be reduced 
to a linear problem by first taking logarithms (preferably to the 
base 10), getting 

log y = log k + n log x 

This equation is linear in the parameters K = log k and N = n.
Hence the determination of the parameters can be carried out as 
outlined above. 

On the other hand, it is not possible to make a rigorous 
linearization of general systems of nonlinear equations of condi- 
tion. But if a reasonable approximation to a solution of such a 
system is available, an approximate linearization of the problem 
can be achieved in the following way: 

Let the equations to be satisfied (as nearly as possible) be 

(21)  $f_1(x,y) = 0, \quad f_2(x,y) = 0, \quad \ldots, \quad f_n(x,y) = 0$

and suppose that $(x_0,y_0)$ is known, by inspection or otherwise, to
be an approximate solution of this system. Then we can expand
each function $f_i(x,y)$ in a generalized Taylor series about the
point $(x_0,y_0)$, getting

$f_i(x,y) = f_i(x_0,y_0) + \left.\frac{\partial f_i}{\partial x}\right|_{x_0,y_0}(x - x_0) + \left.\frac{\partial f_i}{\partial y}\right|_{x_0,y_0}(y - y_0)$

$\;\;\; + \frac{1}{2!}\left[\left.\frac{\partial^2 f_i}{\partial x^2}\right|_{x_0,y_0}(x - x_0)^2 + 2\left.\frac{\partial^2 f_i}{\partial x\,\partial y}\right|_{x_0,y_0}(x - x_0)(y - y_0) + \left.\frac{\partial^2 f_i}{\partial y^2}\right|_{x_0,y_0}(y - y_0)^2\right] + \cdots$

Now, if $(x_0,y_0)$ is a reasonable approximation to the required
solution, the quantities $x - x_0$ and $y - y_0$ will be small, and
hence their squares, products, and higher powers will be negligible
in comparison with the quantities themselves. Omitting these
terms thus reduces the set (21) to the system

(22)  $f_i(x_0,y_0) + \left.\frac{\partial f_i}{\partial x}\right|_{x_0,y_0}(x - x_0) + \left.\frac{\partial f_i}{\partial y}\right|_{x_0,y_0}(y - y_0) = 0 \qquad i = 1, 2, \ldots, n$

which is linear in the unknown corrections $x - x_0$ and $y - y_0$.
The method of least squares can now be applied to the system
(22) in a straightforward way, following which the preliminary
estimate $(x_0,y_0)$ can be appropriately corrected. Of course, if
desired, the given functions $f_i(x,y)$ can be expanded about the
corrected solution $(x_1,y_1)$ and the process repeated. The extension
to systems with more than two unknowns

$f_1(x,y,z,\ldots) = 0, \quad f_2(x,y,z,\ldots) = 0, \quad \ldots, \quad f_n(x,y,z,\ldots) = 0$

is immediate. 
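One correction step of this linearization procedure, for two unknowns, might be sketched as below. The names `linearized_step` and `power_law` are illustrative assumptions; the caller supplies the functions $f_i$ and their first partial derivatives, and the two normal equations of the linearized system are solved by Cramer's rule. The data and starting estimate are those used later in Example 3, and no particular printed values are claimed here.

```python
import math


def linearized_step(residuals, jacobian, x0, y0):
    """Least-squares solution (dx, dy) of the linearized equations
    f_i(x0, y0) + (df_i/dx)*dx + (df_i/dy)*dy = 0,  i = 1, ..., n."""
    f = residuals(x0, y0)
    J = jacobian(x0, y0)               # list of pairs (df_i/dx, df_i/dy)
    sxx = sum(fx * fx for fx, _ in J)
    sxy = sum(fx * fy for fx, fy in J)
    syy = sum(fy * fy for _, fy in J)
    bx = -sum(fx * fi for (fx, _), fi in zip(J, f))
    by = -sum(fy * fi for (_, fy), fi in zip(J, f))
    # Normal equations: sxx*dx + sxy*dy = bx,  sxy*dx + syy*dy = by
    det = sxx * syy - sxy * sxy
    return (bx * syy - sxy * by) / det, (sxx * by - sxy * bx) / det


def power_law(xs, ys):
    """Conditions f_i(k, n) = k*x_i**n - y_i and their partial derivatives."""
    def residuals(k, n):
        return [k * x ** n - y for x, y in zip(xs, ys)]

    def jacobian(k, n):
        return [(x ** n, k * x ** n * math.log(x)) for x in xs]

    return residuals, jacobian


residuals, jacobian = power_law([1, 2, 3, 4], [2.5, 8.0, 19.0, 50.0])
k, n = 0.474, 3.36                     # rough starting estimate
for _ in range(2):                     # two successive corrections
    dk, dn = linearized_step(residuals, jacobian, k, n)
    k, n = k + dk, n + dn
    print(k, n, sum(f * f for f in residuals(k, n)))
```

When the equations of condition are already linear in the unknowns, a single step from any starting point gives the exact least-square solution, which is a useful check on an implementation.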




EXAMPLE 3 

Fit an equation of the form $y = kx^n$ to the data:

x    1        2        3         4
y    2.500    8.000    19.000    50.000


and compute the value of E. 

First let us work the problem by using the logarithmic equivalent 


$\log y = \log k + n \log x$

of the function we are trying to fit to the data. Then the equations of condition are

0.3979 = log k
0.9031 = log k + 0.3010n
1.2788 = log k + 0.4771n
1.6990 = log k + 0.6021n

and from these, by the usual process, we obtain the normal equations

4.0000 log k + 1.3802n = 4.2788
1.3802 log k + 0.6807n = 1.9049

From these we find log k = 0.3472 and n = 2.096. Hence, k = 2.224, and the required function
is

$y = 2.224x^{2.096}$

To find E we must evaluate the function $y = 2.224x^{2.096}$ for x = 1, 2, 3, 4; subtract these
results from the corresponding values of y as originally given; square these differences; and add
them. The work is shown in the following table:

x    y (= 2.224x^2.096)    y (given)    δ         δ²
1     2.224                 2.500        0.276     0.076
2     9.510                 8.000       -1.510     2.280
3    22.243                19.000       -3.243    10.517
4    40.655                50.000        9.345    87.329

                                              E = 100.202


Although we have no real basis for such a conviction, this value of E should strike us as
discouragingly large, especially in view of the fact that we have tried to choose the parameters
k and n to make it as small as possible. To explore the matter further, let us reconsider the
problem in a more elementary way and determine k and n so that the curve will pass exactly
through the points (3,19) and (4,50) without regard to the remaining pair of points. This requires
that

$19 = k\,3^n$ and $50 = k\,4^n$

Dividing the second equation by the first gives us $(4/3)^n = 50/19$. Hence, taking logs,

$n = \frac{\log 50 - \log 19}{\log 4 - \log 3} = 3.36$




With n known, it is easy to find k from the equation $19 = k\,3^n$:

log k = log 19 - 3.36 log 3 = 9.67563 - 10 and k = 0.474

Now, for the function $y = 0.474x^{3.36}$, the calculation of E leads to the following results:

x    y (= 0.474x^3.36)    y (given)    δ        δ²
1     0.474                2.500       2.026    4.105
2     4.865                8.000       3.135    9.828
3    19.000               19.000       0.000    0.000
4    50.000               50.000       0.000    0.000

                                          E = 13.933 (!)


This is a remarkable improvement in the closeness of fit, which surely requires explanation. 

The question will become clearer if we consider the sums of the squares of the errors associated
with the respective functions $y = 2.224x^{2.096}$ and $y = 0.474x^{3.36}$ when they are written
in logarithmic form. These are:

x    log y (= log 2.224 + 2.096 log x)    log y (given)    δ          δ²
1    0.3471                               0.3979            0.0508    0.00258
2    0.9782                               0.9031           -0.0751    0.00564
3    1.3472                               1.2788           -0.0684    0.00468
4    1.6091                               1.6990            0.0899    0.00808

                                                            E = 0.02098

x    log y (= log 0.474 + 3.36 log x)     log y (given)    δ          δ²
1    -0.3244                              0.3979            0.7223    0.52172
2     0.6871                              0.9031            0.2160    0.04666
3     1.2788                              1.2788            0.0000    0.00000
4     1.6990                              1.6990            0.0000    0.00000

                                                            E = 0.56838


The function $y = 2.224x^{2.096}$ which we fitted logarithmically by the method of least squares
fits the logarithms of the data much better than does the second function we derived. Moreover,
it does this by keeping the discrepancies $\delta_i$ about equally small. However, a given difference $\delta$ in
the logarithms of two numbers represents only a small difference in the numbers if the logarithms
are near zero, but represents a large difference if the logarithms themselves are large. Thus, for
a change of 0.10000 in the logarithms, we might have either

0.10000 = logarithm of 1.259
0.00000 = logarithm of 1.000
Difference of the numbers = 0.259

or

1.60000 = logarithm of 39.811
1.50000 = logarithm of 31.623
Difference of the numbers = 8.188


Hence, the average approximation to the original data is significantly improved by keeping the 
errors in the larger logarithms as small as possible, even at the expense of considerably larger 




errors in the smaller logarithms. And, clearly, there is no reason to believe that the function 
which best fits the logarithms of the data will necessarily give the best approximation to the 
data themselves. 
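The contrast between the two measures of fit can be reproduced directly. The sketch below (function names are illustrative) evaluates the sum of squared discrepancies for both candidate curves, once in terms of y and once in terms of log y, using the rounded parameter values quoted in the example; small differences from the tabulated sums are to be expected from rounding.

```python
import math

xs = [1, 2, 3, 4]
ys = [2.5, 8.0, 19.0, 50.0]


def sum_sq_data(k, n):
    """Sum of squared discrepancies in y itself for y = k*x**n."""
    return sum((y - k * x ** n) ** 2 for x, y in zip(xs, ys))


def sum_sq_log(k, n):
    """Sum of squared discrepancies in log10 y for the same curve."""
    return sum(
        (math.log10(y) - (math.log10(k) + n * math.log10(x))) ** 2
        for x, y in zip(xs, ys)
    )


for k, n in [(2.224, 2.096), (0.474, 3.36)]:
    print(k, n, sum_sq_data(k, n), sum_sq_log(k, n))
```

The logarithmic fit wins when the errors are measured in log y and loses badly when they are measured in y, which is the point of the discussion above.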

As a final approach to the problem, let us now try the general method of handling nonlinear 
equations of condition. Assuming again an equation of the form $y = kx^n$ and substituting the
four given sets of values, we find that k and n should satisfy the conditions

$2.5 = k$
$8.0 = k\,2^n$
$19.0 = k\,3^n$
$50.0 = k\,4^n$

As an initial estimate of the values of k and n, let us use the values k = 0.474 and n = 3.36
which we obtained by passing the curve exactly through the points (3,19) and (4,50). Then
expanding each of the equations of condition in a Taylor series around (0.474,3.36), we find

$f_1 = k - 2.500 = -2.026 + (k - 0.474) = 0$

$f_2 = k\,2^n - 8.000 = (4.865 - 8.000) + 2^n\big|_{0.474,3.36}(k - 0.474) + k\,2^n \ln 2\,\big|_{0.474,3.36}(n - 3.36)$
$\;\;\; = -3.135 + 10.267(k - 0.474) + 3.372(n - 3.36) = 0$

$f_3 = k\,3^n - 19.000 = (19.000 - 19.000) + 3^n\big|_{0.474,3.36}(k - 0.474) + k\,3^n \ln 3\,\big|_{0.474,3.36}(n - 3.36)$
$\;\;\; = 40.098(k - 0.474) + 20.874(n - 3.36) = 0$

$f_4 = k\,4^n - 50.000 = (50.000 - 50.000) + 4^n\big|_{0.474,3.36}(k - 0.474) + k\,4^n \ln 4\,\big|_{0.474,3.36}(n - 3.36)$
$\;\;\; = 105.411(k - 0.474) + 69.314(n - 3.36) = 0$

Letting u = k - 0.474 and v = n - 3.36, the approximate equations of condition are, therefore,

u = 2.026
10.267u + 3.372v = 3.135
40.098u + 20.874v = 0.000
105.411u + 69.314v = 0.000

The construction of the normal equations, by multiplying each equation of condition first
by the coefficient of u and then by the coefficient of v in that equation and adding, is a routine
matter, and we find without difficulty

12,825.740u + 8,178.084v = 34.213
8,178.084u + 5,251.525v = 10.571

Hence, u = 0.197 and v = -0.305




and the corrected estimates of k and n are 

k = 0.474 + 0.197 = 0.671 
n = 3.36 - 0.305 = 3.055 

For the function $y = 0.671x^{3.055}$ a straightforward calculation yields E = 22.628, which is still
not so small as the value we found for the curve that passed exactly through the points (3,19)
and (4,50). However, a second application, based upon expanding the equations of condition
around k = 0.671 and n = 3.055, yields the improved values

k = 0.733 and n = 3.039 

and E = 10.052, which is the smallest value of E we have yet found. Another repetition of the 
process would no doubt improve this slightly. 

EXERCISES 

1 Fit a straight line to the data: 


x    1    3    6    7     9
y    1    5    0    10    12


(a) by minimizing the sum of the squares of the vertical distances from the points to the 
line, and (b) by minimizing the sum of the squares of the horizontal distances from the 
points to the line. 

2 Fit an equation of the form $y = a + bx + cx^2$ to the data:

x    -1    0    2    3    5
y    -4    4    8    9    7


3 Find the most plausible values of x and y from the following system of equations : 

x + y = 2
2x - 3y = 9
20x + 16y = 4

(a) without dividing out the factor 4 from the last equation, and (b) after dividing out
the factor 4 from the last equation. Explain.

4 Fit equations of each of the forms: 

a ax + by - 1 = 0    b ax + y - c = 0    c x + by - c = 0

to the data:

x    0      1      2      3
y    1.1    1.9    3.0    3.9

by minimizing the sum of the squares of the amounts by which each of the equations, in 
turn, fails to be satisfied. Compare the results and explain the differences. 




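$\sum_{x=0}^{n} P_{n0}(x)P_{n1}(x) = 0 \qquad \sum_{x=0}^{n} P_{n0}(x)P_{n2}(x) = 0 \qquad \sum_{x=0}^{n} P_{n1}(x)P_{n2}(x) = 0$

$\sum_{x=0}^{n} P_{n0}^2(x) = n + 1 \qquad \sum_{x=0}^{n} P_{n1}^2(x) = \frac{(n+1)(n+2)}{3n} \qquad \sum_{x=0}^{n} P_{n2}^2(x) = \frac{(n+1)(n+2)(n+3)}{5n(n-1)}$

c Express x and $x^2$ as series of the form

$a_0P_{n0}(x) + a_1P_{n1}(x) + a_2P_{n2}(x) + \cdots$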

6 Using orthogonal polynomials, fit functions of each of the forms y = a + bx and
$y = a + bx + cx^2$ to the data:

x    0.50    1.00    1.50    2.00    2.50    3.00
y    1.01    1.08    1.16    1.25    1.29    1.30

and compute the value of E for each approximation. 

7 Fit an equation of the form $y = Ae^{ax}$ to the data:

x    1       2       3       4
y    1.65    2.70    4.50    7.35

a By first taking logarithms and then working with the linearized equation
$\ln y = \ln A + ax$

b By first obtaining approximate values of A and a and then linearizing by expanding the
equations of condition in Taylor's series around these values and retaining only the linear
terms.

8 Derive the following modifications of Formulas (16) and (17): 

$Y_0 = \frac{1}{35}(31y_0 + 9y_1 - 3y_2 - 5y_3 + 3y_4)$

$Y_0' = \frac{1}{70h}(-54y_0 + 13y_1 + 40y_2 + 27y_3 - 26y_4)$

and

$Y_0 = \frac{1}{35}(9y_{-1} + 13y_0 + 12y_1 + 6y_2 - 5y_3)$

$Y_0' = \frac{1}{70h}(-34y_{-1} + 3y_0 + 20y_1 + 17y_2 - 6y_3)$

9 Derive Formula 14. 

10 It is desired to fit a circular arc to a set of points $(x_1,y_1), \ldots, (x_n,y_n)$. Discuss the relative
merits of doing this by minimizing the sum of the squares of the vertical distances from the
points to the circular arc and by taking the equation of the circle in the form
$x^2 + y^2 + ax + by + c = 0$ and minimizing the sum of the squares of the amounts by which the
coordinates of the points fail to satisfy this equation.

11 It is desired to fit an equation of the form $y = Ae^{ax}$ to a set of points $(x_1,y_1), \ldots, (x_n,y_n)$.
By observing that y must satisfy a certain linear, constant-coefficient, first-order difference
equation, obtain the following equations of condition:

$y_2 - e^a y_1 = 0$
$y_3 - e^a y_2 = 0$
. . . . . . . .
$y_n - e^a y_{n-1} = 0$

Show how A can be found after the best least-square approximation to a has been found
from these equations. Discuss the advantages of this method relative to the method of
linearizing by taking logarithms and the general method for handling problems in which
the parameters enter nonlinearly.

12 Explain how the method of Exercise 11 can be extended to the fitting of functions of the
form $y = Ae^{ax} + Be^{bx} + \cdots + Ke^{kx}$.

13 Explain how a continuous function can be approximated over an interval (a, b) by minimizing
the integral of the squared difference between the given function and the chosen approximation.
Illustrate by approximating the function cos x over the interval (0, π/2) with a
function of the form $y = a - bx^2$.

14 Approximate the solution of $y'' + x^2y = \sin x$ for which y(0) = y(π) = 0 by assuming
y = A sin x and choosing A to minimize the integral from 0 to π of the square of the amount
by which A sin x fails to satisfy the differential equation.

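15 Show that if a line with equation x cos θ + y sin θ - p = 0 is fitted to a set of points
$(x_1,y_1), \ldots, (x_n,y_n)$ by minimizing the sum of the squares of the perpendicular distances
from the points to the line, the value of θ is given by the formula

$\tan 2\theta = \frac{2r_{xy}\sigma_x\sigma_y}{\sigma_x^2 - \sigma_y^2}$

where $\sigma_x$ and $\sigma_y$ are, respectively, the so-called standard deviations of the x-values and the
y-values,

$\sigma_x = \frac{1}{n}\sqrt{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2} \qquad \sigma_y = \frac{1}{n}\sqrt{n\sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2}$

and $r_{xy}$ is the coefficient of correlation between the x-values and the y-values,

$r_{xy} = \frac{n\sum_{i=1}^{n} x_iy_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n^2\sigma_x\sigma_y}$

What is the value of p in the equation of the line of best fit?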



CHAPTER FIVE

Mechanical and Electrical Circuits

5.1 Introduction

An examination of the application of differential equations to
mechanical and electrical systems is valuable for at least two 
reasons. In the first place, it will furnish us with useful informa- 
tion about the behavior of certain physical systems of great 
practical interest. Second, and perhaps more important, it will 
provide a striking example of the role which mathematics plays 
in unifying widely differing phenomena. For instance, we shall 
see that, merely by a renaming of the variables, the analysis of the 
motion of a weight vibrating on a spring becomes the analysis of 
a simple electrical circuit. Moreover, this correspondence is not 
merely qualitative or descriptive. It is quantitative, in the sense 
that if one is given any of a wide variety of vibrating mechan- 
ical systems, an electrical circuit can be constructed whose cur- 
rents or voltages, as preferred, will give the exact values of the 
displacements in the mechanical system when suitable scale 
factors are introduced. Since electrical circuits are easy to 
assemble and since currents and voltages are easy to measure, 
this affords a practical method of studying the vibration of com- 
plicated mechanical configurations, such as engine crankshafts, 
which are expensive to make and modify and whose motions are 
difficult to record accurately.*

* Of course, mechanical models of electrical circuits can also be constructed,
but there is little practical reason for constructing them.

5.2 Systems with one degree of freedom

A system which can be described completely by one coordinate,
i.e., by one physical datum such as a displacement, an angle, a
current, or a voltage, is called a system of one degree of freedom.
A system requiring more than one coordinate for its complete
description is called a system of several degrees of freedom. A
single differential equation suffices for the mathematical description
of a system of one degree of freedom. A set of simultaneous
differential equations, as many equations as there are degrees of
freedom, is necessary for the analysis of systems of more than one
degree of freedom. We shall begin our investigations by considering,
as prototypes of the general system with one degree of
freedom, each of the configurations shown in Fig. 5.1. In each
case we assume that all the elements of the system are concentrated,
or lumped. In other words, such things as the distributed
mass of the spring in Fig. 5.1a, the distributed moment of inertia
of the shaft in Fig. 5.1b, and the resistance of the leads in Fig. 5.1c
and d we assume to be either negligible or taken into account
through suitable corrections added to the corresponding major
elements.*

FIGURE 5.1 Four simple systems of one degree of freedom: (a) translational-mechanical;
(b) torsional-mechanical; (c) series-electrical; (d) parallel-electrical.



(a) Coordinate = vertical 
displacement of weight y 


Elastic 
shaft k 


Friction I I Moment of 
device c I inertial 

Disturbing 
-ok-*" torque T 

(6) Coordinate = angular 
displacement of disk d 


Resistance Capacitance 
R c 

| VW 1(- 


Impressed 
voltage E 

Inductance L 
1 


■> 


(c) Coordinate = current i, 
flowing around loop 



(d) Coordinate = common voltage e 
between nodes A and B 


* In many problems these assumptions are not sufficiently accurate, and the 
continuous distribution of the components of the system must be considered. 
As we shall see in Chap. 8, this leads to partial rather than ordinary differen- 
tial equations. 





In Fig. 5.1a we assume the weight to be guided, so that only 
vertical motion, without swinging, is possible. As indicated, the 
effect of friction is not neglected. Instead, we suppose that a 
retarding force proportional to the velocity acts at all times. 
Friction of this sort is known as viscous friction or viscous 
damping. Its existence is well established for moderate velocities, 
although for large velocities the resistance may be more nearly 
proportional to the square or even the cube of the velocity. 

The analysis of this system is based upon Newton's law, 

(1) Mass × acceleration = force

Measuring the displacement y from the equilibrium position of 
the weight, with the positive direction upward, we have 

(2) Acceleration of the mass = $\dfrac{d^2y}{dt^2}$

The most obvious force acting on the mass is the attraction of 
gravity: 

(3) Gravitational force = $-w$

the minus sign indicating that this force acts downward. To com- 
pute the elastic force, we observe first that a weight w when 
hung on a spring of modulus k, that is, a spring requiring k units 
of force to extend it one unit of length, will stretch the spring a 
distance equal to w/k. Hence, when the weight moves from this 
equilibrium level during the course of its motion, the instantane- 
ous elongation of the spring is w/k — y. If this quantity is posi- 
tive, the spring is stretched and, therefore, applies to the mass a 
force which acts in the upward, or positive, direction. If this 
quantity is negative, the spring is compressed and, therefore, 
applies to the mass a force which acts in the downward, or nega- 
tive, direction. The force the spring exerts on the mass at any 
time is, therefore, 

Force per unit elongation × instantaneous elongation
or

(4) Elastic force = $k\left(\dfrac{w}{k} - y\right) = w - ky$

To determine the frictional force, we observe that the velocity of
the mass is $dy/dt$; hence, from the assumption of viscous damping,

(5) Frictional force = $-c\,\dfrac{dy}{dt}$

the minus sign indicating that the resistance always acts in 
opposition to the velocity. Finally, through some external 
agency, a disturbing force, usually periodic, may act upon the 
system, upsetting its condition of equilibrium. We shall consider 
specifically the important case in which 

(6) Impressed force = $F_0\cos\omega t$, $F_0$ a constant



SEC. 5.2 


5YSTEMS WITH ONE DEGREE OF FREEDOM 


147 


Substituting from Eqs. (2) to (6) into Newton’s law, Eq. (1), 
we thus have 

$\dfrac{w}{g}\dfrac{d^2y}{dt^2} = -w + (w - ky) - c\,\dfrac{dy}{dt} + F_0\cos\omega t$

or

(7) $\dfrac{w}{g}\dfrac{d^2y}{dt^2} + c\,\dfrac{dy}{dt} + ky = F_0\cos\omega t$

We note from this equation that the gravitational force on the 
weight is canceled by that part of the elastic force due to the 
initial elongation of the spring. Because of this, in the future we 
shall neglect gravitational forces from the outset in the analysis of 
problems of this sort. 

Equation (7) is a typical nonhomogeneous, linear differential 
equation of the second order with constant coefficients, whose 
general solution we can easily find by the methods of Chap. 2. 
Presumably it will be accompanied by given initial conditions 

$y(0) = y_0$ and $\left.\dfrac{dy}{dt}\right|_{t=0} = v_0$

and, by using these, the constants in any complete solution can be 
determined. However, before continuing with the solution of 
Eq. (7) we shall derive the equations governing the other systems 
shown in Fig. 5.1. 

The analysis of the system of Fig. 5.1b is based upon Newton's
law in torsional form:

(8) Moment of inertia × angular acceleration = torque

In this case the various torques are

(9) Elastic torque due to the twisting of the shaft = $-k\theta$

(10) Viscous damping torque = $-c\,\dfrac{d\theta}{dt}$

(11) Impressed torque = $T_0\cos\omega t$

Since the angular acceleration is $d^2\theta/dt^2$, on substituting into
Newton's law, Eq. (8), we have

$I\,\dfrac{d^2\theta}{dt^2} = -k\theta - c\,\dfrac{d\theta}{dt} + T_0\cos\omega t$

or

(12) $I\,\dfrac{d^2\theta}{dt^2} + c\,\dfrac{d\theta}{dt} + k\theta = T_0\cos\omega t$

This, too, is a completely familiar differential equation, and, 
when accompanied by the initial conditions 

$\theta(0) = \theta_0$ and $\left.\dfrac{d\theta}{dt}\right|_{t=0} = \theta_0'$




it can easily be solved for the function describing the behavior of 
any particular system. 

The analysis of the series, or one-loop, electrical circuit 
shown in Fig. 5.1c is based on Kirchhoff's second law: The
algebraic sum of the potential differences around any closed loop is
zero, or the voltage impressed on a closed loop is equal to the sum of
the voltage drops in the rest of the loop. Using well-known electrical 
laws, we have 

(13) Voltage drop across the resistance = $iR$

(14) Voltage drop across the condenser = $\dfrac{1}{C}\displaystyle\int^t i\,dt$

(15) Voltage drop across the inductance = $L\,\dfrac{di}{dt}$

Thus, assuming the important case in which 

(16) Impressed voltage = $E_0\cos\omega t$

on substituting from Eqs. (13) to (16) into Kirchhoff’s second 
law, we have 

(17) $L\,\dfrac{di}{dt} + iR + \dfrac{1}{C}\displaystyle\int^t i\,dt = E_0\cos\omega t$

Strictly speaking, this is not a differential equation, but
rather an integrodifferential equation. The operational methods
we shall develop in Chap. 7 will handle it directly, but, before we
can apply the techniques we have available at this stage, we must
convert it into a pure differential equation. There are two ways of
doing this. The first is to regard not $i$ but $\int^t i\,dt$ as the dependent
variable of the problem. This is not merely a mathematical
stratagem, for the quantity

$Q = \displaystyle\int^t i\,dt$

that is, the integrated flow of current into the condenser, is pre- 
cisely the quantity of electricity, or electric charge, instantane- 
ously present on the condenser. In terms of Q, then, we have the 
equation 

(18a) $L\,\dfrac{d^2Q}{dt^2} + R\,\dfrac{dQ}{dt} + \dfrac{1}{C}\,Q = E_0\cos\omega t$

subject, of course, to the given initial conditions 

$Q(0) \equiv \displaystyle\int^{t=0} i\,dt = Q_0$ and $\left.\dfrac{dQ}{dt}\right|_{t=0} = i(0) = i_0$

On the other hand, we can also convert Eq. (17) into a differential 
equation simply by differentiating it with respect to time, getting 

(18b) $L\,\dfrac{d^2i}{dt^2} + R\,\dfrac{di}{dt} + \dfrac{1}{C}\,i = -\omega E_0\sin\omega t$




The initial conditions required for an equation of this form are

$i(0) = i_0$ and $\left.\dfrac{di}{dt}\right|_{t=0} = i_0'$

The first of these was given for the original equation. The second
can be found from the original equation, since

$\dfrac{di}{dt} = \dfrac{1}{L}\left(E_0\cos\omega t - iR - \dfrac{1}{C}\displaystyle\int^t i\,dt\right)$

and the right-hand side is completely known at $t = 0$.

To establish the differential equation describing the behavior
of the parallel, or one-node-pair, electrical circuit shown in Fig.
5.1d, we must use Kirchhoff's first law: The algebraic sum of the
currents flowing toward any point in an electrical circuit is zero.
Solving for $i$ in Eqs. (13), (14), and (15) we obtain, respectively,

(19) Current through the resistance = $\dfrac{e}{R}$

(20) Current (apparently) through the condenser = $C\,\dfrac{de}{dt}$

(21) Current through the inductance = $\dfrac{1}{L}\displaystyle\int^t e\,dt$

Thus, assuming the important case of a current source such that

(22) Impressed current = $I_0\cos\omega t$

on substituting from Eqs. (19) to (22) into Kirchhoff's first law,
we have

(23) $C\,\dfrac{de}{dt} + \dfrac{1}{R}\,e + \dfrac{1}{L}\displaystyle\int^t e\,dt = I_0\cos\omega t$

Again, our derivation has led to an integrodifferential equation.
To convert it to a pure differential equation we can consider
$\int^t e\,dt = U$, say, as a new variable, getting

(24a) $C\,\dfrac{d^2U}{dt^2} + \dfrac{1}{R}\dfrac{dU}{dt} + \dfrac{1}{L}\,U = I_0\cos\omega t$

subject to initial conditions of the form

$U(0) \equiv \displaystyle\int^{t=0} e\,dt = U_0$ and $\left.\dfrac{dU}{dt}\right|_{t=0} = e(0) = e_0$

On the other hand, we can simply differentiate Eq. (23) with
respect to time, getting

(24b) $C\,\dfrac{d^2e}{dt^2} + \dfrac{1}{R}\dfrac{de}{dt} + \dfrac{1}{L}\,e = -\omega I_0\sin\omega t$

subject to initial conditions of the form

$e(0) = e_0$ and $\left.\dfrac{de}{dt}\right|_{t=0} = e_0'$





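When we collect the differential equations we have derived,

(7) $\dfrac{w}{g}\dfrac{d^2y}{dt^2} + c\,\dfrac{dy}{dt} + ky = F_0\cos\omega t$ (translational-mechanical)

(12) $I\,\dfrac{d^2\theta}{dt^2} + c\,\dfrac{d\theta}{dt} + k\theta = T_0\cos\omega t$ (torsional-mechanical)

(18a) $L\,\dfrac{d^2Q}{dt^2} + R\,\dfrac{dQ}{dt} + \dfrac{Q}{C} = E_0\cos\omega t$ (series-electrical)

(18b) $L\,\dfrac{d^2i}{dt^2} + R\,\dfrac{di}{dt} + \dfrac{i}{C} = -\omega E_0\sin\omega t$ (series-electrical)

(24a) $C\,\dfrac{d^2U}{dt^2} + \dfrac{1}{R}\dfrac{dU}{dt} + \dfrac{U}{L} = I_0\cos\omega t$ (parallel-electrical)

(24b) $C\,\dfrac{d^2e}{dt^2} + \dfrac{1}{R}\dfrac{de}{dt} + \dfrac{e}{L} = -\omega I_0\sin\omega t$ (parallel-electrical)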


their essential mathematical identity becomes apparent. More- 
over, we can see the possibility of various physical analogies. For 
instance, if we compare the translational-mechanical and the 
series-electrical systems, we find that 

Mass $w/g$ ↔ inductance $L$
Friction $c$ ↔ resistance $R$
Spring modulus $k$ ↔ elastance $1/C$
Impressed force $F$ ↔ impressed voltage $E$ [using Eq. (18a)] or $dE/dt$ [using Eq. (18b)]
Displacement $y$ ↔ charge $Q$ [using Eq. (18a)] or current $i$ [using Eq. (18b)]


and, if we compare the translational-mechanical and the parallel- 
electrical systems, we have the correspondences 

Mass $w/g$ ↔ capacitance $C$
Friction $c$ ↔ conductance $1/R$
Spring modulus $k$ ↔ susceptance $1/L$
Impressed force $F$ ↔ impressed current $I$ [using Eq. (24a)] or $dI/dt$ [using Eq. (24b)]
Displacement $y$ ↔ $\int^t e\,dt$ [using Eq. (24a)] or voltage $e$ [using Eq. (24b)]


We shall not pursue these analogies further. Instead we shall 
investigate one or two of the systems in detail. 
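The analogies listed above can be illustrated with a minimal Python sketch. It reduces each system to the three coefficients of its second-order equation and checks that analogous systems share the same characteristic roots. The function names and the numerical values are hypothetical, chosen only for illustration.

```python
import cmath

def characteristic_roots(a, b, c):
    """Roots of a*m**2 + b*m + c = 0, the characteristic equation of a*x'' + b*x' + c*x = f(t)."""
    disc = cmath.sqrt(b * b - 4.0 * a * c)
    return ((-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a))

def mechanical(w, g, c, k):
    # Coefficients of Eq. (7): (w/g) y'' + c y' + k y = F0 cos(omega t)
    return (w / g, c, k)

def series_electrical(L, R, C):
    # Coefficients of Eq. (18a): L Q'' + R Q' + (1/C) Q = E0 cos(omega t)
    return (L, R, 1.0 / C)

def parallel_electrical(C, R, L):
    # Coefficients of Eq. (24a): C U'' + (1/R) U' + (1/L) U = I0 cos(omega t)
    return (C, 1.0 / R, 1.0 / L)

# Hypothetical mechanical system (weight in lb, g in in./sec^2, k in lb/in.).
mech = mechanical(w=50.0, g=386.0, c=0.5, k=20.0)

# Analogous circuits: mass <-> L or C, friction <-> R or 1/R, spring modulus <-> 1/C or 1/L.
series = series_electrical(L=mech[0], R=mech[1], C=1.0 / mech[2])
parallel = parallel_electrical(C=mech[0], R=1.0 / mech[1], L=1.0 / mech[2])

for name, system in (("mechanical", mech), ("series", series), ("parallel", parallel)):
    print(name, characteristic_roots(*system))
```

Because the three coefficient triples coincide, the free behavior of the three systems is the same function of time, which is what the analogies assert.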


SEC. 5.3 


THE TRANSLATIONAL-MECHANICAL SYSTEM 


151 


5.3 

The translational-mechanical system 


The displacement $y$ of the weight $w$ in the translational-mechanical
system (Fig. 5.1a) has been shown to satisfy the differential
equation

(1) $\dfrac{w}{g}\dfrac{d^2y}{dt^2} + c\,\dfrac{dy}{dt} + ky = F_0\cos\omega t$


Following the general theory of Chap. 2, it must therefore consist 
of two parts. The complementary function, obtained by solving 
Eq. (1) when the term representing the impressed force is deleted, 
describes the motion of the weight in the absence of any external 
disturbance. This intrinsic, or natural, behavior of the system is 
called the free motion. The particular integral describes the 
response of the system to a specific influence external to the 
system. The behavior it represents is called the forced motion. 

The nature of the free motion of the system will depend upon 
the roots of the characteristic equation 


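$\dfrac{w}{g}\,m^2 + cm + k = 0$

namely, $m = -\dfrac{cg}{2w} \pm \dfrac{g}{2w}\sqrt{c^2 - \dfrac{4kw}{g}}$

Since $g$, $w$, and $k$ are all positive and $c$ is nonnegative, and since
the radical, when real, is certainly less than $c$, it follows that the
real parts of the roots $m_1$ and $m_2$ are never positive, and are
negative whenever $c > 0$. We must now consider three possibilities:

$c^2 - \dfrac{4kw}{g} > 0$, $\quad c^2 - \dfrac{4kw}{g} = 0$, $\quad c^2 - \dfrac{4kw}{g} < 0$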

In the first case, $c^2 - 4kw/g > 0$, there is a relatively large
amount of friction, and, naturally enough, the system is said to be
overdamped. The free motion, i.e., the motion described by the
complementary function, is now given by the expression

$y = Ae^{m_1 t} + Be^{m_2 t}$

where, as we pointed out above, both $m_1$ and $m_2$ are negative.
Thus $y$ approaches zero as time increases indefinitely. This, of
course, is perfectly consistent with the familiar observation that 
if a system upon which no external forces are acting is displaced 
from its equilibrium position, it will eventually return to that 
position as friction causes the disturbance to subside. 

If we set $y = 0$, we obtain the equation

$Ae^{m_1 t} + Be^{m_2 t} = 0$ or $e^{(m_1 - m_2)t} = -\dfrac{B}{A}$, $A \neq 0$




If A and B, which will, of course, be determined by the initial 
conditions of the problem, are of opposite sign, then there is one 
and only one value of t which satisfies the last equation. On the 
other hand, since a real exponential function must always be 
positive, it follows that when A and B have the same sign or when 
one or the other of them is zero, there is no time when $y = 0$. A
plot of the displacement $y$ during the free motion of an overdamped
system must therefore resemble one of the curves shown
in Fig. 5.2 or the reflection of one of these curves in the $t$-axis.
Figure 5.2a, b, and c illustrates the possibilities when $A$ and $B$ are
of opposite sign and $y$ vanishes once and only once. Assuming
that the weight starts its motion when $t = 0$, the zero of $y$ may, of
course, occur in the physically irrelevant interval $-\infty < t < 0$.
Figure 5.2d illustrates both the case when $A$ and $B$ are of like
sign and the case when either $A$ or $B$ is zero and $y$ can never
vanish.
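The sign test on $A$ and $B$ can be tried numerically. The following Python sketch, under assumed illustrative values of $w$, $g$, $c$, and $k$, fits $A$ and $B$ to initial conditions $y(0) = y_0$, $y'(0) = v_0$ and solves $e^{(m_1 - m_2)t} = -B/A$ for the crossing time. The function names are hypothetical and not part of the text.

```python
import math

def overdamped_free_motion(w, g, c, k, y0, v0):
    """Return (m1, m2, A, B) for y = A*exp(m1*t) + B*exp(m2*t) with y(0)=y0, y'(0)=v0."""
    disc = c * c - 4.0 * k * w / g
    if disc <= 0.0:
        raise ValueError("system is not overdamped")
    m1 = (-c + math.sqrt(disc)) * g / (2.0 * w)
    m2 = (-c - math.sqrt(disc)) * g / (2.0 * w)
    # Solve A + B = y0 and m1*A + m2*B = v0 for A and B.
    A = (v0 - m2 * y0) / (m1 - m2)
    B = (m1 * y0 - v0) / (m1 - m2)
    return m1, m2, A, B

def zero_crossing_time(m1, m2, A, B):
    """Time at which y vanishes, or None when A and B are not of opposite sign."""
    if A * B >= 0.0:
        return None
    return math.log(-B / A) / (m1 - m2)

# Assumed values: 50-lb weight, k = 20 lb/in., c = 5 (above the critical value), g = 386 in./sec^2.
for v0 in (-40.0, -10.0, 0.0):
    m1, m2, A, B = overdamped_free_motion(50.0, 386.0, 5.0, 20.0, 1.0, v0)
    print(v0, zero_crossing_time(m1, m2, A, B))
```

A negative crossing time corresponds to the physically irrelevant interval $-\infty < t < 0$, and `None` corresponds to the case of Fig. 5.2d.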



If $c^2 - 4kw/g = 0$,
we have the borderline case in which the roots of the characteristic
equation are real and equal:

$m_1 = m_2 = -\dfrac{cg}{2w}$

When this occurs, the motion is said to be critically damped, and
the exact value of the damping which produces it, namely,

(2) $c_c = \sqrt{\dfrac{4kw}{g}} = 2\sqrt{\dfrac{kw}{g}}$

is known as the critical damping. In this case the free motion is
given by

$y = Ae^{m_1 t} + Bte^{m_1 t}$

If we set $y = 0$, we obtain

$Ae^{m_1 t} + Bte^{m_1 t} = 0$ or $t = -\dfrac{A}{B}$, $B \neq 0$

If $B = 0$, there is no value of $t$ for which $y = 0$, but in all other
cases there is one and only one value of $t$ for which $y = 0$. This
may be in the physically irrelevant interval $-\infty < t < 0$, however,
and so it is possible that $y$ will not vanish in the actual
motion even when $B \neq 0$. Clearly, there is no essential difference



in the character of the motion in the overdamped and critically
damped cases, and the possible plots of the displacement $y$ in the
critically damped case are also represented by the curves of Fig.
5.2 and their reflections in the $t$-axis.

FIGURE 5.2
Displacement-time plots for free overdamped and critically damped motion, panels (a) to (d).

If $c^2 - (4kw/g) < 0$, the motion is said to be underdamped.
The roots of the characteristic equation in this case are the
conjugate complex numbers

$m_1, m_2 = -\dfrac{cg}{2w} \pm i\,\dfrac{g}{2w}\sqrt{\dfrac{4kw}{g} - c^2} = -p \pm iq$

where

(3) $p = \dfrac{cg}{2w}$ and $q = \dfrac{g}{2w}\sqrt{\dfrac{4kw}{g} - c^2}$

The free motion is therefore described by

(4a) $y = e^{-pt}(A\cos qt + B\sin qt)$

or equally well by

(4b) $y = Ge^{-pt}\cos(qt - H)$

or by

(4c) $y = Ke^{-pt}\sin(qt - L)$

where $A$, $B$, $G$, $H$, $K$, and $L$ are arbitrary constants.
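As a rough illustration of the three cases and of the equivalence of forms (4a) and (4b), here is a short Python sketch. The function names are hypothetical. The relations $G = \sqrt{A^2 + B^2}$ and $H = \tan^{-1}(B/A)$ (taken in the proper quadrant) follow from expanding $\cos(qt - H)$ and are not stated explicitly in the text.

```python
import math

def classify_damping(w, g, c, k):
    """Classify the free motion by the sign of c**2 - 4*k*w/g."""
    disc = c * c - 4.0 * k * w / g
    if disc > 0.0:
        return "overdamped"
    if disc == 0.0:  # exact equality is fragile in floating point; shown for completeness
        return "critically damped"
    return "underdamped"

def underdamped_p_q(w, g, c, k):
    """Constants p and q of Eq. (3)."""
    p = c * g / (2.0 * w)
    q = (g / (2.0 * w)) * math.sqrt(4.0 * k * w / g - c * c)
    return p, q

def form_4a(t, p, q, A, B):
    return math.exp(-p * t) * (A * math.cos(q * t) + B * math.sin(q * t))

def form_4b(t, p, q, G, H):
    return G * math.exp(-p * t) * math.cos(q * t - H)

def amplitude_phase(A, B):
    """Constants G, H of form (4b) matching A, B of form (4a): A = G cos H, B = G sin H."""
    return math.hypot(A, B), math.atan2(B, A)

# Assumed values for illustration only.
print(classify_damping(50.0, 386.0, 0.5, 20.0))
print(classify_damping(50.0, 386.0, 5.0, 20.0))
```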

The motion described by either (4a), (4b), or (4c) is known
as a damped oscillation, and its general appearance is shown in
Fig. 5.3. It is not periodic, since the factor multiplying the
trigonometric terms is continuously decreasing. However, there are
regularly spaced passages through the equilibrium position at
intervals of $\pi/q$. In fact, using the description of the motion
provided by Eq. (4b), it is clear that $y = 0$ whenever

$\cos(qt - H) = 0$

that is, when $qt - H = \pi/2 + n\pi$ or

$t = \dfrac{1}{q}\left(H + \dfrac{\pi}{2} + n\pi\right)$







Hence, we can speak of the pseudo period $2\pi/q$ and of the pseudo
frequency or "frequency with damping" $\omega_d$, defined by the equation

(5) $\omega_d = q = \sqrt{\dfrac{kg}{w} - \dfrac{c^2g^2}{4w^2}}$ rad/unit time, or $\dfrac{\omega_d}{2\pi}$ "cycles"/unit time

If $c = 0$, that is, if there is no damping in the system, the
motion is strictly periodic, and its frequency, which we shall call
the undamped natural frequency $\omega_n$, is, from (5), given by the
formula

(6) $\omega_n = \sqrt{\dfrac{kg}{w}}$ rad/unit time, or $\dfrac{\omega_n}{2\pi}$ cycles/unit time

Clearly the "frequency" when damping is present is always less
than the undamped natural frequency. The ratio of the two
frequencies is

$\dfrac{\omega_d}{\omega_n} = \dfrac{\sqrt{kg/w - c^2g^2/4w^2}}{\sqrt{kg/w}} = \sqrt{1 - \dfrac{c^2}{c_c^2}}$

since, from Eq. (2), $c_c^2 = 4kw/g$. Figure 5.4 shows a plot of
$\omega_d/\omega_n$ versus $c/c_c$. Evidently, if the actual damping is only a small
fraction of the critical damping, as it often is, its effect upon the
frequency of the motion is very small. This explains why friction
is usually neglected in natural-frequency calculations.


FIGURE 5.4
Plot showing the effect of friction on frequency in an underdamped system.



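Still using Eq. (4b), it is clear that the extreme values of $y$
occur when

$\dfrac{dy}{dt} = G[-pe^{-pt}\cos(qt - H) - qe^{-pt}\sin(qt - H)] = 0$

that is, when $\tan(qt - H) = -p/q$, or, finally, when

$t = \dfrac{H}{q} - \dfrac{1}{q}\,\mathrm{Tan}^{-1}\dfrac{p}{q} + \dfrac{n\pi}{q} = T + \dfrac{n\pi}{q}$

where $T$ denotes the constant $(H/q) - (1/q)\,\mathrm{Tan}^{-1}(p/q)$.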

The ratio of successive extreme displacements on the same
side of the equilibrium position is a quantity of considerable
importance. Its value is

(7) $\dfrac{y_n}{y_{n+2}} = \dfrac{y(T + n\pi/q)}{y[T + (n+2)\pi/q]} = \dfrac{Ge^{-p(T + n\pi/q)}\cos\left[q\left(T + \dfrac{n\pi}{q}\right) - H\right]}{Ge^{-p[T + (n+2)\pi/q]}\cos\left[q\left(T + \dfrac{(n+2)\pi}{q}\right) - H\right]}$

$\quad = e^{2\pi p/q}\,\dfrac{\cos(qT + n\pi - H)}{\cos(qT + n\pi - H + 2\pi)} = e^{2\pi p/q}$

Since this result depends only on the parameters of the system 
and not on n, we have thus established the following remarkable 
result:


THEOREM 1 


The ratio of successive maximum (or minimum) displacements remains constant 
throughout the entire free motion of an underdamped system. 
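Theorem 1 can be checked numerically with a small Python sketch. It evaluates form (4b) at the extreme times $T + n\pi/q$ and compares the ratio of extremes two apart with $e^{2\pi p/q}$. The values of $p$, $q$, $G$, and $H$ are arbitrary assumptions made only for the check.

```python
import math

def free_underdamped(t, p, q, G, H):
    """Form (4b): y = G exp(-p t) cos(q t - H)."""
    return G * math.exp(-p * t) * math.cos(q * t - H)

def extreme_times(p, q, H, count):
    """Times T + n*pi/q, n = 0..count-1, at which dy/dt = 0 for form (4b)."""
    T = H / q - math.atan(p / q) / q
    return [T + n * math.pi / q for n in range(count)]

p, q, G, H = 0.3, 4.0, 2.0, 0.7   # hypothetical values
times = extreme_times(p, q, H, 8)
extremes = [free_underdamped(t, p, q, G, H) for t in times]

# Successive extremes on the same side of equilibrium are two half-periods apart.
ratios = [extremes[n] / extremes[n + 2] for n in range(len(extremes) - 2)]
print(ratios, math.exp(2.0 * math.pi * p / q))
```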

If we take the natural logarithm of Expression (7), we have

(8) $\ln\dfrac{y_n}{y_{n+2}} = \dfrac{2\pi p}{q}$

This quantity is known as the logarithmic decrement $\delta$, and it is a
convenient measure, in nepers per cycle, of the rate at which the
motion dies away.* Substituting for $p$ and $q$ from (3) into (8), we
find

$\delta = 2\pi\,\dfrac{cg/2w}{(g/2w)\sqrt{(4kw/g) - c^2}} = \dfrac{2\pi c}{\sqrt{c_c^2 - c^2}}$

Solved for $c/c_c$, this becomes

(9) $\dfrac{c}{c_c} = \dfrac{\delta}{\sqrt{\delta^2 + 4\pi^2}}$

Since $y_n$ and $y_{n+2}$ are quantities relatively easy to measure, $\delta$ can
easily be computed. Then from Eq. (9) the fraction of critical
damping present in a given system can be found at once.
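The passage from measured peaks to the damping ratio can be sketched in a few lines of Python. The function names are hypothetical. The inverse relation $\delta = 2\pi(c/c_c)/\sqrt{1 - (c/c_c)^2}$ is obtained by dividing the numerator and denominator of the expression for $\delta$ by $c_c$.

```python
import math

def decrement_from_peaks(y_n, y_n_plus_2):
    """Eq. (8): delta = ln(y_n / y_{n+2})."""
    return math.log(y_n / y_n_plus_2)

def damping_ratio_from_decrement(delta):
    """Eq. (9): c/c_c = delta / sqrt(delta**2 + 4*pi**2)."""
    return delta / math.sqrt(delta * delta + 4.0 * math.pi ** 2)

def decrement_from_damping_ratio(zeta):
    """Inverse of Eq. (9), valid for 0 <= zeta < 1."""
    return 2.0 * math.pi * zeta / math.sqrt(1.0 - zeta * zeta)

# A peak that falls to 60 per cent of its value one cycle later.
delta = decrement_from_peaks(1.0, 0.60)
print(delta, damping_ratio_from_decrement(delta))
```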

* Equivalently, though less conventionally, the rate of attenuation could be
expressed in decibels per cycle by means of the definition

Decibels = $20\log_{10}\dfrac{y_n}{y_{n+2}}$

Now that we have investigated the free motion of the
translational-mechanical system in the overdamped, critically damped,
and underdamped cases, it remains for us to consider the forced
motion. To do this we must, of course, find a particular integral
for Eq. (1):

(1) $\dfrac{w}{g}\dfrac{d^2y}{dt^2} + c\,\dfrac{dy}{dt} + ky = F_0\cos\omega t$

Assuming, as usual,

$Y = A\cos\omega t + B\sin\omega t$

and substituting into (1), collecting terms, and equating to zero
the coefficients of $\cos\omega t$ and $\sin\omega t$, we obtain the two conditions

$\left(k - \omega^2\dfrac{w}{g}\right)A + \omega cB = F_0$

$-\omega cA + \left(k - \omega^2\dfrac{w}{g}\right)B = 0$

from which we find immediately

$A = \dfrac{k - \omega^2(w/g)}{[k - \omega^2(w/g)]^2 + (\omega c)^2}\,F_0$

$B = \dfrac{\omega c}{[k - \omega^2(w/g)]^2 + (\omega c)^2}\,F_0$

Hence

$Y = F_0\,\dfrac{[k - \omega^2(w/g)]\cos\omega t + (\omega c)\sin\omega t}{[k - \omega^2(w/g)]^2 + (\omega c)^2}$

$\quad = \dfrac{F_0}{\sqrt{[k - \omega^2(w/g)]^2 + (\omega c)^2}}\left\{\dfrac{k - \omega^2(w/g)}{\sqrt{[k - \omega^2(w/g)]^2 + (\omega c)^2}}\cos\omega t + \dfrac{\omega c}{\sqrt{[k - \omega^2(w/g)]^2 + (\omega c)^2}}\sin\omega t\right\}$

Now, referring to the triangle shown in Fig. 5.5, it is evident
that $Y$ can be written in either of the equivalent forms

(10a) $Y = \dfrac{F_0}{\sqrt{[k - \omega^2(w/g)]^2 + (\omega c)^2}}\,(\cos\omega t\cos\alpha + \sin\omega t\sin\alpha) = \dfrac{F_0}{\sqrt{[k - \omega^2(w/g)]^2 + (\omega c)^2}}\cos(\omega t - \alpha)$

(10b) $Y = \dfrac{F_0}{\sqrt{[k - \omega^2(w/g)]^2 + (\omega c)^2}}\,(\cos\omega t\sin\beta + \sin\omega t\cos\beta) = \dfrac{F_0}{\sqrt{[k - \omega^2(w/g)]^2 + (\omega c)^2}}\sin(\omega t + \beta)$
The first of these equations is the more convenient because it 
involves the same function (the cosine) as the excitation term in 
the differential equation. Hence, the phase relation between the 
response of the system and the disturbing force can easily be 
inferred. Accordingly, we shall continue with the first expression 
for Y. 


FIGURE 5.5
The triangle defining the phase angles appearing in Eqs. (10a) and (10b).



If we divide the numerator and denominator by $k$ and
rearrange slightly, we obtain

$Y = \dfrac{F_0/k}{\sqrt{[1 - (\omega^2/k)(w/g)]^2 + (\omega c/k)^2}}\cos(\omega t - \alpha)$

$\quad = \dfrac{F_0/k}{\sqrt{\left(1 - \dfrac{\omega^2}{kg/w}\right)^2 + \left(2\,\dfrac{\omega}{\sqrt{kg/w}}\,\dfrac{c}{\sqrt{4kw/g}}\right)^2}}\cos(\omega t - \alpha)$

(11) $\quad = \dfrac{\delta_{st}}{\sqrt{[1 - (\omega^2/\omega_n^2)]^2 + [2(\omega/\omega_n)(c/c_c)]^2}}\cos(\omega t - \alpha)$

where $\delta_{st} = F_0/k$ is the static deflection which a constant force of
magnitude $F_0$ would produce in a spring of modulus $k$, and, as
before, $\omega_n^2 = kg/w$ and $c_c^2 = 4kw/g$.

The quantity

$M = \dfrac{1}{\sqrt{[1 - (\omega^2/\omega_n^2)]^2 + [2(\omega/\omega_n)(c/c_c)]^2}}$

is called the magnification ratio. It is the factor by which the
static deflection produced in a spring of modulus $k$ by a constant
force $F_0$ must be multiplied in order to give the amplitude of the
vibrations which result when the same force acts dynamically
with frequency $\omega$. Curves of the magnification ratio $M$ plotted
against the frequency ratio $\omega/\omega_n$ for various values of the damping
ratio $c/c_c$ are shown in Fig. 5.6. An inspection of Fig. 5.6 reveals
the following interesting facts:

a $M = 1$, regardless of the amount of damping, if $\omega/\omega_n = 0$.


FIGURE 5.6
Curves of the magnification ratio $M$ as a function of the impressed frequency ratio $\omega/\omega_n$ for various amounts of damping.







b If $0 < c/c_c < 1/\sqrt{2}$, $M$ rises to a maximum as $\omega/\omega_n$ increases
from 0, the peak value of $M$ occurring in all cases before the
impressed frequency $\omega$ reaches the undamped natural frequency $\omega_n$.

c The smaller the amount of friction, the larger the maximum of
$M$, until for conditions of undamped resonance, namely,
$c/c_c = 0$ and $\omega = \omega_n$, infinite magnification, i.e., a response of
infinite amplitude, occurs.

d If $c/c_c \geq 1/\sqrt{2}$, the magnification ratio decreases steadily as
$\omega/\omega_n$ increases from 0.

e For all values of $c/c_c$, $M$ approaches zero as the impressed frequency
is raised indefinitely above the undamped natural frequency
of the system.
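Observations a to e can be explored with a brief Python sketch of the magnification ratio. The names `magnification` and `peak_frequency_ratio` are hypothetical. The peak location $\omega/\omega_n = \sqrt{1 - 2(c/c_c)^2}$ is not given in the text; it follows from minimizing the quantity under the radical with respect to $(\omega/\omega_n)^2$, and it is consistent with the threshold $1/\sqrt{2}$ in items b and d.

```python
import math

def magnification(r, zeta):
    """Magnification ratio M for r = omega/omega_n and zeta = c/c_c.

    Raises ZeroDivisionError at undamped resonance (r = 1, zeta = 0), where M is infinite.
    """
    return 1.0 / math.sqrt((1.0 - r * r) ** 2 + (2.0 * r * zeta) ** 2)

def peak_frequency_ratio(zeta):
    """Frequency ratio at which M is largest; 0 when zeta >= 1/sqrt(2)."""
    inside = 1.0 - 2.0 * zeta * zeta
    return math.sqrt(inside) if inside > 0.0 else 0.0

for zeta in (0.1, 0.3, 0.8):
    r_peak = peak_frequency_ratio(zeta)
    print(zeta, r_peak, magnification(r_peak, zeta))
```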


The angle $\alpha = \tan^{-1}\dfrac{\omega c}{k - \omega^2(w/g)}$, $\quad 0 \leq \alpha \leq \pi$

which appears in Eq. (10a) and is shown in Fig. 5.5, is known as
the phase angle or angle of lag of the response. Like the
magnification ratio, it, too, can easily be expressed in terms of the
dimensionless parameters $\omega/\omega_n$ and $c/c_c$. To do this we need only
divide the numerator and denominator of the right-hand side of
the last expression by $k$ and rearrange slightly:

$\alpha = \tan^{-1}\dfrac{\omega c/k}{1 - \omega^2(w/kg)} = \tan^{-1}\dfrac{(\omega/\sqrt{kg/w})(2c/\sqrt{4kw/g})}{1 - \omega^2/(kg/w)} = \tan^{-1}\dfrac{2(\omega/\omega_n)(c/c_c)}{1 - (\omega/\omega_n)^2}$

It is important to note that $\alpha$ is not to be read from the
principal-value branch of the arctangent function, for it is evident from
Fig. 5.5 that $\sin\alpha$ is always positive, whereas $\cos\alpha$ can be either
positive or negative. Hence, $\alpha$ must be an angle between 0 and $\pi$
and not an angle in the principal-value range $(-\pi/2, \pi/2)$. Plots
of $\alpha$ versus the frequency ratio $\omega/\omega_n$ for various values of the
damping ratio $c/c_c$ are shown in Fig. 5.7.

The physical significance of $\alpha$ is shown in Fig. 5.8. The
displacement $Y$ reaches its maxima $\alpha/\omega$ units of time after or later
than the driving force reaches its corresponding peak values.
When the frequency of the disturbing force is well below the
undamped natural frequency of the system, $\alpha$ is small and the
forced vibrations lag only slightly behind the driving force. When
the impressed frequency is equal to the natural frequency, the
response of the system lags the excitation by one-quarter of
a cycle. As $\omega$ increases indefinitely, the lag of the response
approaches half a cycle, or, in other words, the response becomes
180° out of phase with respect to the driving force.
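The branch of the arctangent matters when the phase angle is computed. A minimal Python sketch, assuming the dimensionless form of $\alpha$ in terms of $\omega/\omega_n$ and $c/c_c$, uses the two-argument arctangent so that the result falls between 0 and $\pi$. The function name `phase_lag` is hypothetical.

```python
import math

def phase_lag(r, zeta):
    """Angle of lag alpha in [0, pi] for r = omega/omega_n >= 0 and zeta = c/c_c >= 0.

    atan2 takes sin-like and cos-like arguments separately, so the quadrant is
    chosen correctly when 1 - r**2 is negative.
    """
    return math.atan2(2.0 * r * zeta, 1.0 - r * r)

for r in (0.05, 0.5, 1.0, 2.0, 50.0):
    alpha = phase_lag(r, 0.2)
    print(r, alpha, math.degrees(alpha))
```

The time lag of the response is then `phase_lag(r, zeta) / omega`, matching the statement that the displacement reaches its maxima $\alpha/\omega$ units of time after the driving force does.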

FIGURE 5.7
Curves of the phase angle $\alpha$ as a function of the impressed frequency ratio $\omega/\omega_n$ for various amounts of damping.

FIGURE 5.8
Plot showing the significance of the phase angle as a measure of the time by which the response lags the excitation in a mechanical system. The forced displacement is $Y = (\delta_{st}M)\cos(\omega t - \alpha)$.

The results of our detailed study of the vibrating weight can
now be summarized. The complete motion of the system consists
of two parts. The first is described by the complementary function
of the underlying differential equation and may be either
oscillatory or nonoscillatory according as the amount of friction
in the system is less than or more than the critical damping figure
for the system. In any case, however, this part of the solution
contains factors which decay exponentially and thus becomes
vanishingly small in a very short time. For this reason it is known
as the transient. The general expression for the transient contains
two arbitrary constants, which, after the complete solution has
been constructed, must be determined to fit the initial conditions
of displacement and velocity. The second part of the solution is
described by the particular integral. In the highly important case
in which the system is acted upon by a pure harmonic disturbing
force (we considered only $F = F_0\cos\omega t$, but without exception
all our conclusions are equally valid for $F = F_0\sin\omega t$), this term
represents a harmonic displacement of the same frequency as the
excitation but lagging behind the latter. The amplitude of this
displacement is a definite multiple of the steady deflection which
would be produced in the system by a constant force of the same
magnitude as the actual, alternating force. This factor of
magnification, like the amount of lag, depends solely on the amount of
friction in the system and on the ratio of the impressed frequency
to the undamped natural frequency of the system. The particular
integral does not decay as time goes on but continues indefinitely
in the same pattern. For this reason it is known as the steady
state.


EXAMPLE 1

A 50-lb weight is suspended from a spring of modulus 20 lb/in. When the system is vibrating
freely, it is observed that in consecutive cycles the maximum displacement decreases by 40 per
cent. If a force equal to $10\cos\omega t$ acts upon the system, find the amplitude of the resultant
steady-state motion if (a) $\omega = 6$, (b) $\omega = 12$, and (c) $\omega = 18$ rad/sec.

The first step here is to determine the amount of damping present in the system. From the
given data it is clear that

$y_{n+2} = 0.60\,y_n$

and, thus, that $\delta = \ln\dfrac{y_n}{y_{n+2}} = \ln\dfrac{1}{0.60} = 0.511$

Hence, by Eq. (9),

$\dfrac{c}{c_c} = \dfrac{\delta}{\sqrt{\delta^2 + 4\pi^2}} = \dfrac{0.511}{\sqrt{(0.511)^2 + 4\pi^2}} = 0.081$

Next we must compute the undamped natural frequency of the system. Using Eq. (6),
we have

$\omega_n = \sqrt{\dfrac{kg}{w}} = 12.4$ rad/sec

Knowing $c/c_c$ and $\omega_n$, we can now use Eq. (11) to compute the magnification ratio for
$\omega = 6$, 12, and 18. Direct substitution gives the values

ω      6       12      18
M      1.30    5.94    0.88

Finally, it is clear that a 10-lb force, acting statically, will stretch a spring of modulus
20 lb/in. a distance

$\delta_{st} = \dfrac{10}{20} = 0.5$ in.

Hence, multiplying this static deflection by the appropriate values of the magnification ratio,
we find, for the amplitude $A$ of the steady-state motion, the values

ω      6       12      18
A      0.65    2.97    0.44

The amplitude corresponding to the impressed frequency $\omega = 12$ is much larger than either of
the others because this frequency very nearly coincides with the natural frequency of the
system, $\omega_n = 12.4$.
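The arithmetic of this example can be reproduced with a short Python sketch. The value g = 386 in./sec^2 is an assumption (the worked numbers above do not show which value was used), so the computed magnification ratios agree with the tabulated ones only to about slide-rule accuracy, most noticeably near resonance at ω = 12.

```python
import math

w, k, F0 = 50.0, 20.0, 10.0   # lb, lb/in., lb (from the example)
g = 386.0                     # in./sec^2 (assumed value)

delta = math.log(1.0 / 0.60)                                # logarithmic decrement, Eq. (8)
zeta = delta / math.sqrt(delta ** 2 + 4.0 * math.pi ** 2)   # c/c_c, Eq. (9)
omega_n = math.sqrt(k * g / w)                              # undamped natural frequency, rad/sec
delta_st = F0 / k                                           # static deflection, in.

def magnification(omega):
    """Magnification ratio M of Eq. (11) at impressed frequency omega (rad/sec)."""
    r = omega / omega_n
    return 1.0 / math.sqrt((1.0 - r * r) ** 2 + (2.0 * r * zeta) ** 2)

results = {omega: (magnification(omega), delta_st * magnification(omega))
           for omega in (6.0, 12.0, 18.0)}

for omega, (M, A) in results.items():
    print(f"omega = {omega:4.1f} rad/sec   M = {M:5.2f}   A = {A:5.2f} in.")
```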


EXAMPLE 2

A system containing a negligible amount of damping is disturbed from its equilibrium position
by the sudden application at $t = 0$ of a force equal to $F_0\sin\omega t$. Discuss the subsequent motion
of the system if $\omega$ is close to the natural frequency $\omega_n$.

The differential equation to be solved here is

$\dfrac{w}{g}\dfrac{d^2y}{dt^2} + ky = F_0\sin\omega t$

The complementary function is, clearly,

$y = A\cos\sqrt{\dfrac{kg}{w}}\,t + B\sin\sqrt{\dfrac{kg}{w}}\,t$

and it is easy to verify that a particular integral is

$Y = \dfrac{F_0\sin\omega t}{k - \omega^2(w/g)}$

Hence, recalling that $\omega_n = \sqrt{kg/w}$, a complete solution can be written:

$y = A\cos\omega_n t + B\sin\omega_n t + \dfrac{F_0g}{w}\,\dfrac{\sin\omega t}{\omega_n^2 - \omega^2}$

Since $y = 0$ when $t = 0$, we must have $A = 0$, leaving

(13) $y = B\sin\omega_n t + \dfrac{F_0g}{w}\,\dfrac{\sin\omega t}{\omega_n^2 - \omega^2}$

Differentiating the last equation to obtain the velocity $v = dy/dt$ and then substituting
$v = 0$ and $t = 0$, we obtain

$0 = B\omega_n + \dfrac{F_0g\omega}{w(\omega_n^2 - \omega^2)}$ or $B = -\dfrac{F_0g\omega}{w(\omega_n^2 - \omega^2)\omega_n}$

Hence, substituting into (13), we find for the required solution

$y = \dfrac{F_0g}{w(\omega_n^2 - \omega^2)}\left(-\dfrac{\omega}{\omega_n}\sin\omega_n t + \sin\omega t\right)$

If the impressed frequency $\omega$ is very close to the natural frequency $\omega_n$, we can for descriptive
purposes set $\omega/\omega_n = 1$ in the last expression, obtaining

$y = \dfrac{F_0g}{w}\,\dfrac{\sin\omega_n t - \sin\omega t}{\omega^2 - \omega_n^2}$

If we convert the difference of the sine terms into a product, we get

$y = -\dfrac{F_0g}{w}\,\dfrac{2\cos\dfrac{(\omega + \omega_n)t}{2}\,\sin\dfrac{(\omega - \omega_n)t}{2}}{(\omega + \omega_n)(\omega - \omega_n)}$

If we now denote the small quantity $\omega - \omega_n$ by $2\epsilon$ and note that $\omega + \omega_n$ is approximately equal
to $2\omega$, we can write

$y = -\dfrac{F_0g}{w}\,\dfrac{\sin\epsilon t}{2\omega\epsilon}\,\cos\omega t$

Since $\epsilon$ is a small quantity, the period $2\pi/\epsilon$ of the term $\sin\epsilon t$ is large. Hence, the form of the last
expression shows that $y$ can be regarded as essentially a periodic function $\cos\omega t$ of frequency $\omega$,
with slowly varying amplitude

$-\dfrac{F_0g}{w}\,\dfrac{\sin\epsilon t}{2\omega\epsilon}$

Figure 5.9 shows the general nature of this behavior when $\omega$ is nearly but not quite equal to $\omega_n$
and in the limiting case when $\omega = \omega_n$ and conditions of pure resonance exist.

This is one of the simplest illustrations of the phenomenon of beats, which occurs whenever
an impressed frequency is close to a natural frequency of a system or whenever two slightly
different frequencies are impressed upon a system regardless of what its natural frequencies
may be. A wave form of variable amplitude, such as that shown in Fig. 5.9a, is said to be
amplitude-modulated, and the lighter curves to which the actual wave periodically rises and falls are
called its envelope.


FIGURE 5.9 
Plot illustrating 
the phenomenon 
of beats. 
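The beat approximation of Example 2 can be compared with the exact undamped response in a short Python sketch. The numerical values are assumptions chosen for illustration. Unlike the descriptive form in the example, the sketch keeps the mean frequency $(\omega + \omega_n)/2$ in the cosine factor rather than replacing it by $\omega$, so that the only approximation left is $\omega/\omega_n = 1$.

```python
import math

w, k, F0, g = 50.0, 20.0, 10.0, 386.0   # hypothetical values (lb, lb/in., lb, in./sec^2)
omega_n = math.sqrt(k * g / w)          # natural frequency, about 12.43 rad/sec
omega = 12.0                            # impressed frequency, close to omega_n
eps = 0.5 * (omega - omega_n)           # half the frequency difference
mean = 0.5 * (omega + omega_n)          # mean of the two frequencies

def exact(t):
    """Exact solution of Example 2 with y(0) = 0 and y'(0) = 0."""
    K = F0 * g / (w * (omega_n ** 2 - omega ** 2))
    return K * (math.sin(omega * t) - (omega / omega_n) * math.sin(omega_n * t))

def envelope(t):
    """Slowly varying amplitude of the beat form."""
    return -(F0 * g / w) * math.sin(eps * t) / (2.0 * mean * eps)

def beat_form(t):
    """Product form obtained after setting omega/omega_n = 1."""
    return envelope(t) * math.cos(mean * t)

# The two differ by K*(1 - omega/omega_n)*sin(omega_n*t), whose size is bounded by:
bound = F0 * g / (w * omega_n * (omega + omega_n))
print(max(abs(exact(0.01 * i) - beat_form(0.01 * i)) for i in range(2000)), bound)
```

The bound is smaller than the peak of the envelope by a factor of about $|\omega - \omega_n|/\omega_n$, which is why the product form describes the motion well when the two frequencies are close.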




SEC. 5.3 


THE TRANSLATIONAL-MECHANICAL SYSTEM 


163 


EXERCISES

1 If friction is neglected, show that the natural frequency of a system consisting of a mass on an elastic suspension is approximately $3.13/\sqrt{\delta_{st}}$ cycles/sec, where $\delta_{st}$ is the deflection, in inches, produced in the suspension when the mass hangs in static equilibrium.

2 A heavy motor of unknown weight is set upon a felt mounting pad of unknown spring constant. What is the natural frequency of the system if the motor is observed to compress the pad $\tfrac{1}{16}$ in.?

3 Prove that the logarithmic decrement $\delta$ is equal to the natural logarithm of the ratio of any nonzero displacement to the displacement one full cycle later.

4 Show that the logarithmic decrement $\delta$ can also be computed from the formula

$$\delta = \frac{1}{k}\ln\frac{y_n}{y_{n+2k}} \qquad k = 1, 2, 3, \ldots$$

5 For a given value of $c/c_c$, determine the minimum number of cycles required to produce a reduction of at least 50 per cent in the amplitude of a damped oscillation.

6 If $c/c_c$ is small, show that the logarithmic decrement is approximately $2\pi c/c_c$.

7 Show that the energy dissipated during the $n$th cycle of a damped oscillation is equal to $(k/2)(y_n^2 - y_{n+2}^2)$. Hence, using the result of Exercise 6, show that, when $c/c_c$ is small, the energy loss during the $n$th cycle is approximately $k\delta y_n^2$.

8 If the roots of the characteristic equation in the overdamped case are $m = -r \pm s$, show that in general the complementary function can be written as $y = Ae^{-rt}\cosh(st + B)$ or as $y = Ce^{-rt}\sinh(st + D)$, according as it has no real zero or one real zero. Are there any exceptions?

9 If $y_0$ and $v_0$ are, respectively, the initial displacement and initial velocity with which an overdamped system begins its motion, show that

$$\frac{w}{g}\left(\frac{v_0}{y_0}\right)^2 + c\,\frac{v_0}{y_0} + k > 0$$

is the condition that the complementary function have a real zero.

10 In addition to the condition of Exercise 9, what further requirement is necessary to ensure that the zero of the complementary function will be positive, i.e., will occur during the actual motion?

11 An overdamped system begins to move from its equilibrium position with velocity $v_0$. Show that its maximum displacement occurs when

$$t = \frac{1}{\omega_n\sqrt{(c/c_c)^2 - 1}}\tanh^{-1}\frac{\sqrt{(c/c_c)^2 - 1}}{c/c_c}$$

(Hint: Use the results of Exercise 8.)

12 In Exercise 11, show that the maximum displacement is


where


13 Investigate the answers to Exercises 11 and 12 in the limit when $c/c_c$ approaches 1. Check your results by working directly with the equation for the transient in the critically damped case.




14 Show that the maximum displacements during the free motion of an underdamped system do not occur midway between the zeros of the displacement, but precede the mid-points by the constant amount

$$\frac{\operatorname{Sin}^{-1}(c/c_c)}{\omega_n\sqrt{1 - (c/c_c)^2}}$$

15 Investigate the motion of a weight hanging on a spring when the disturbing force is equal to $F_0\sin\omega t$ instead of $F_0\cos\omega t$. In particular, show that Eqs. (11) and (12) for the magnification ratio and phase angle, respectively, are still the same.

16 Show that the maxima of the curves of the magnification ratio versus frequency ratio occur when

$$\frac{\omega}{\omega_n} = \sqrt{1 - 2\left(\frac{c}{c_c}\right)^2}$$



17 A weight of 128 lb hangs from a spring of modulus 75 lb/in. The damping in the system is 
28 per cent of critical. Determine the motion of the weight if it is pulled downward 2 in. 
from its equilibrium position and suddenly released. 

18 A weight of 96 lb hangs from a spring of modulus 25 lb/in. The damping in the system is 
60 per cent of critical. Determine the motion of the weight if it is pulled downward 1 in. 
from its equilibrium position and released with an upward velocity of 2 in. /sec. 

19 Solve Exercise 18 if a constant force of 50 lb is suddenly applied to the system when it is 
at rest in its equilibrium position. 

20 A weight of 54 lb hangs from a spring of modulus 36 lb/in. During the free motion of the 
system it is observed that the maximum displacement of the weight decreases to one-tenth 
of its value in five complete cycles of the motion. Find the amplitude of the steady-state 
motion produced by a force equal to $6\sin 15t$ lb. By what time interval does this steady-state motion lag the driving force?

21 A uniform bar of length l and weight w rests on two horizontal rollers whose axes are parallel 
and which rotate inwardly in opposition to each other with constant angular velocity. 
Friction between the bar and each roller is assumed to be “dry,” or Coulomb; that is, it is 
proportional to the normal force between the bar and the roller, the proportionality constant being the so-called coefficient of friction $\mu$. When the bar, which always remains in a line perpendicular to the axes of the rollers, is displaced slightly from a symmetrical position, it executes small oscillations in the horizontal direction. Determine the period of this motion, and show how the value of $\mu$ can thus be found experimentally.

22 In many applications involving forces arising from rotating parts which have become unbal- 
anced, the amplitude of the sinusoidal disturbing force acting on a system is not constant, 
but varies directly as the square of the frequency. If a weight suspended from a spring 
is acted upon by a force of this character, determine the steady-state motion. In particular, 
determine the form of the magnification ratio and the formula for the angle of lag. 

23 Show that the maxima on the plots of the magnification ratio versus the frequency ratio 
under the conditions of Exercise 22 always occur at values of the impressed frequency $\omega$ greater than the natural frequency of the system.

24 A particle of weight w moves in a horizontal line under the influence of an elastic force 
equal to —kx, where x is the distance of the particle from the origin. Friction in the system 
is assumed to be dry rather than viscous; that is, it is proportional to the normal force 
between the particle and the surface on which it moves, and does not depend on the velocity. 
Show that the motion of the system is described by the differential equations

$$\frac{w}{g}\ddot{x} + kx = \mu w \qquad \text{when the particle is moving to the left}$$

$$\frac{w}{g}\ddot{x} + kx = -\mu w \qquad \text{when the particle is moving to the right}$$

If the body starts from rest at the point $x = x_0$, find $x$ as a function of $t$. What is the decrease in amplitude per cycle? When will the body come to rest?

25 A system is acted upon by two forces

$$F_1 = A_1\sin\omega_1 t \qquad \text{and} \qquad F_2 = A_2\sin\omega_2 t$$

Friction, though present in the system, is so small that it can be neglected in determining the forced motion. Discuss the steady-state behavior of the system if $\omega_1$ and $\omega_2$ are nearly equal but if neither is close to the natural frequency of the system. In particular, show that the response consists of a term of frequency $(\omega_1 + \omega_2)/2$ whose amplitude is modulated by a term of frequency $(\omega_1 - \omega_2)/2$, and determine the limits between which the amplitude varies. Hint: After the particular integrals have been determined, note that the expression $K_1\sin\omega_1 t + K_2\sin\omega_2 t$ can be written

$$\frac{K_1 + K_2}{2}(\sin\omega_1 t + \sin\omega_2 t) + \frac{K_1 - K_2}{2}(\sin\omega_1 t - \sin\omega_2 t)$$


The series-electrical circuit 

All the results we obtained in the last section can, after a suitable 
change in terminology, be applied to any of the other systems we 
have considered. However, the concepts central in one field are 
not always of equal importance in related fields, and it seems 
desirable to illustrate the minor differences in the application of 
our general theory to various classes of systems by considering 
one of the electrical circuits in some detail. 

For the simple series circuit with an alternating impressed voltage, we derived (among several equivalent forms) the equation

$$L\frac{d^2Q}{dt^2} + R\frac{dQ}{dt} + \frac{1}{C}Q = E_0\cos\omega t \tag{1}$$

and on comparing this with the differential equation of the vibrating weight

$$\frac{w}{g}\frac{d^2y}{dt^2} + c\frac{dy}{dt} + ky = F_0\cos\omega t$$

we noted the correspondences

Mass $w/g$ ↔ inductance $L$
Friction $c$ ↔ resistance $R$
Spring modulus $k$ ↔ elastance $1/C$
Impressed force $F$ ↔ impressed voltage $E$
Displacement $y$ ↔ charge $Q$
Velocity $v$ ↔ current $i$

Extending this correspondence to the derived results by making the appropriate substitutions, we infer from the undamped natural frequency of the mechanical system

$$\omega_n = \sqrt{\frac{kg}{w}}$$

that the electrical circuit has a natural frequency

$$\Omega_n = \frac{1}{\sqrt{LC}}$$

when no resistance is present. Furthermore, the concept of critical damping

$$c_c = 2\sqrt{\frac{kw}{g}}$$

leads to the concept of critical resistance

$$R_c = 2\sqrt{\frac{L}{C}}$$

which determines whether the free behavior of the electrical system will be oscillatory or nonoscillatory.
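As a quick numerical companion to these correspondences, the sketch below computes the natural frequency $1/\sqrt{LC}$ and the critical resistance $2\sqrt{L/C}$ of a series circuit and classifies its free behavior. The function names and the tolerance used to detect the critical case are illustrative assumptions, not notation from the text; the sample values are the elements $L = 1$, $R = 1{,}000$, $C = 6.25 \times 10^{-6}$ of Example 1 later in this section.

```python
import math

def natural_frequency(L, C):
    """Undamped natural frequency 1/sqrt(LC), in rad/sec."""
    return 1.0 / math.sqrt(L * C)

def critical_resistance(L, C):
    """Critical resistance 2*sqrt(L/C), the analogue of critical damping."""
    return 2.0 * math.sqrt(L / C)

def classify(L, R, C, rel_tol=1e-9):
    """Describe the free behavior of the series circuit (tolerance is an assumption)."""
    Rc = critical_resistance(L, C)
    if math.isclose(R, Rc, rel_tol=rel_tol):
        return "critically damped"
    return "overdamped" if R > Rc else "underdamped"

if __name__ == "__main__":
    L, R, C = 1.0, 1000.0, 6.25e-6
    print(natural_frequency(L, C), critical_resistance(L, C), classify(L, R, C))
```

For these values the resistance exceeds the critical resistance, so the free behavior is nonoscillatory, in agreement with the real roots found in Example 1.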

The notion of magnification ratio can also be extended to 
the electrical case, but it is not customary to do so because the 
extension would relate to Q (the analogue of the displacement y), 
whereas in most electrical problems it is not Q but i which is the 
variable of interest. To see how a related concept arises in the 
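electrical case, let us convert the particular integral $Y$ given by Eq. (10b), Sec. 5.3, into its electrical equivalent. By direct substitution the result is found to be

$$Q = \frac{E_0\sin(\omega t + \phi)}{\sqrt{[(1/C) - \omega^2 L]^2 + (\omega R)^2}} \qquad \phi = \operatorname{Tan}^{-1}\frac{(1/C) - \omega^2 L}{\omega R}$$

To obtain the current $i$, we differentiate this, getting

$$i = \frac{dQ}{dt} = \frac{E_0\,\omega\cos(\omega t + \phi)}{\sqrt{[(1/C) - \omega^2 L]^2 + (\omega R)^2}}$$

or, dividing numerator and denominator by $\omega$ in the expressions for both $i$ and $\phi$,

$$i = \frac{E_0\cos(\omega t - \delta)}{\sqrt{R^2 + [\omega L - (1/\omega C)]^2}} \tag{2}$$

where

$$\delta = -\phi = \operatorname{Tan}^{-1}\frac{\omega L - (1/\omega C)}{R} \tag{3}$$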

From Eq. (2) we infer that the steady-state current produced by an alternating voltage is of the same frequency as the voltage, but differs from it in phase by

$$\frac{\delta}{\omega}\ \text{units of time} \qquad \text{or} \qquad \frac{\delta}{2\pi}\ \text{cycles}$$


Moreover, from Eq. (3) it is clear that the numerator of $\tan\delta$ (which is proportional to $\sin\delta$) can be either positive or negative, whereas the denominator of $\tan\delta$ (which is proportional to $\cos\delta$) is always positive. Hence $\delta$ must be an angle between $-\pi/2$ and $\pi/2$, and so the principal-value designation in Eq. (3) is appropriate. If $\delta$ is positive, the steady-state current lags the voltage; if $\delta$ is negative, the steady-state current leads the voltage.

Furthermore, from Eq. (2) we see that the amplitude of the steady-state current is obtained by dividing the amplitude of the impressed voltage $E_0$ by the expression

$$\sqrt{R^2 + \left(\omega L - \frac{1}{\omega C}\right)^2} \tag{4}$$

By analogy with Ohm's law, $I = E/R$, the quantity (4) thus appears as a generalized resistance, although it is actually called the impedance of the circuit. While not the analogue of the magnification ratio, the impedance is clearly a similar concept. Since impedance is defined as

$$\frac{\text{Voltage}}{\text{Current}}$$

the mechanical quantity corresponding to this is the ratio

$$\frac{\text{Force}}{\text{Velocity}}$$

This is called the mechanical impedance by some writers and in 
certain mechanical problems has proved a useful notion. 
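The impedance (4) and the phase angle (3) are easy to evaluate numerically. The following sketch is illustrative only: the function names are assumptions, and the element values are borrowed from Example 1 of this section with arbitrarily chosen frequencies. Because $R$ is positive, `math.atan2(X, R)` returns the principal value between $-\pi/2$ and $\pi/2$ required by Eq. (3).

```python
import math

def reactance(L, C, omega):
    """The bracketed quantity omega*L - 1/(omega*C) in Eqs. (3) and (4)."""
    return omega * L - 1.0 / (omega * C)

def impedance(R, L, C, omega):
    """Impedance sqrt(R^2 + (omega*L - 1/(omega*C))^2) of Eq. (4)."""
    return math.hypot(R, reactance(L, C, omega))

def phase_angle(R, L, C, omega):
    """Angle delta of Eq. (3); positive means the current lags the voltage."""
    return math.atan2(reactance(L, C, omega), R)

def time_lag(R, L, C, omega):
    """Phase difference delta/omega expressed in units of time."""
    return phase_angle(R, L, C, omega) / omega

if __name__ == "__main__":
    R, L, C = 1000.0, 1.0, 6.25e-6
    for omega in (200.0, 400.0, 800.0):  # below, at, and above 1/sqrt(LC) = 400
        print(omega, impedance(R, L, C, omega), phase_angle(R, L, C, omega))
```

In this sketch the current leads the voltage below the natural frequency, is in phase with it at the natural frequency (where the impedance reduces to $R$), and lags above it.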

There is another approach to the problem of determining the steady-state current produced by a harmonic voltage that is well worth investigating. Suppose that, given either $E = E_0\cos\omega t$ or $E = E_0\sin\omega t$, we write the basic differential equation (1) in the form

$$L\frac{d^2Q}{dt^2} + R\frac{dQ}{dt} + \frac{1}{C}Q = E_0e^{j\omega t} = E_0(\cos\omega t + j\sin\omega t) \tag{5}$$ †

This includes both possibilities for the voltage, and, if the real and the imaginary terms retain their identity throughout the analysis, then the real part of the particular integral corresponding to $E_0e^{j\omega t}$ will be the particular integral for $E_0\cos\omega t$, and the imaginary part will be the particular integral for $E_0\sin\omega t$.

To see that this is actually the case, we must first find a particular integral of Eq. (5). As usual, we do this by assuming

$$Q = Ae^{j\omega t}$$

and substituting into the differential equation. This gives

$$L(-\omega^2 Ae^{j\omega t}) + R(j\omega Ae^{j\omega t}) + \frac{1}{C}(Ae^{j\omega t}) = E_0e^{j\omega t}$$

which will be an identity if and only if

$$A = \frac{E_0}{-\omega^2 L + j\omega R + (1/C)}$$

Hence

$$Q = \frac{E_0}{-\omega^2 L + j\omega R + (1/C)}\,e^{j\omega t}$$

From this, by differentiation, we find that

$$\frac{dQ}{dt} = i = \frac{j\omega E_0}{j\omega R - \omega^2 L + (1/C)}\,e^{j\omega t} = \frac{E_0e^{j\omega t}}{R + j[\omega L - (1/\omega C)]}$$

† To avoid confusing $i = \sqrt{-1}$ with $i$ = current, we shall throughout the rest of this chapter follow the standard practice of writing $\sqrt{-1} = j$.

FIGURE 5.10 Plot showing the relations among the magnitude, angle, and components of a general complex number $a + jb$.


To find the real and imaginary parts of this expression, it is convenient to use the fact (Sec. 14.7) that any complex number $a + jb$ can be written in the form $a + jb = re^{j\delta}$, where the magnitude $r$ and the angle $\delta$ of the complex number are related to the components $a$ and $b$ as shown in Fig. 5.10. Applied to the denominator of the second expression for $i$, this gives

$$R + j\left(\omega L - \frac{1}{\omega C}\right) = \sqrt{R^2 + \left(\omega L - \frac{1}{\omega C}\right)^2}\;e^{j\delta}$$

where

$$\delta = \operatorname{Tan}^{-1}\frac{\omega L - (1/\omega C)}{R}$$

Hence we can rewrite $i$ in the form

$$i = \frac{E_0}{\sqrt{R^2 + [\omega L - (1/\omega C)]^2}\;e^{j\delta}}\,e^{j\omega t} = \frac{E_0}{\sqrt{R^2 + [\omega L - (1/\omega C)]^2}}\,e^{j(\omega t - \delta)} = E_0\,\frac{\cos(\omega t - \delta) + j\sin(\omega t - \delta)}{\sqrt{R^2 + [\omega L - (1/\omega C)]^2}}$$

Comparing this with Eqs. (2) and (3), it is clear that the real part here is exactly the particular integral corresponding to $E_0\cos\omega t$, as we derived it directly. Similarly, had we taken the trouble to work it out explicitly, we would have found for the particular integral corresponding to $E_0\sin\omega t$ precisely the imaginary part of the last expression. Since it is much easier to find the particular integral corresponding to an exponential term than it is to find the particular integral for a cosine or sine term, the advantage of using $E_0e^{j\omega t}$ in place of $E_0\cos\omega t$ or $E_0\sin\omega t$ is obvious.
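The equivalence between the complex-exponential method and the direct trigonometric result can be confirmed with complex arithmetic. This is a minimal sketch under stated assumptions: the function names are hypothetical, Python's built-in `1j` plays the role of $j$, and the element values and frequency are chosen only for illustration. It forms $E_0e^{j\omega t}/\{R + j[\omega L - (1/\omega C)]\}$ and compares its real and imaginary parts with the cosine and sine forms of the steady-state current.

```python
import cmath
import math

def current_complex(t, E0, R, L, C, omega):
    """Complex steady-state current E0 e^{j omega t} / (R + j(omega L - 1/(omega C)))."""
    Z = complex(R, omega * L - 1.0 / (omega * C))
    return E0 * cmath.exp(1j * omega * t) / Z

def _magnitude_and_angle(R, L, C, omega):
    X = omega * L - 1.0 / (omega * C)
    return math.hypot(R, X), math.atan2(X, R)

def current_cosine(t, E0, R, L, C, omega):
    """Steady-state current for E0 cos(omega t), as in Eq. (2)."""
    mag, delta = _magnitude_and_angle(R, L, C, omega)
    return E0 * math.cos(omega * t - delta) / mag

def current_sine(t, E0, R, L, C, omega):
    """Steady-state current for E0 sin(omega t)."""
    mag, delta = _magnitude_and_angle(R, L, C, omega)
    return E0 * math.sin(omega * t - delta) / mag

if __name__ == "__main__":
    args = (24.0, 1000.0, 1.0, 6.25e-6, 500.0)  # E0, R, L, C, omega (illustrative)
    z = current_complex(0.0017, *args)
    print(z.real, current_cosine(0.0017, *args))
    print(z.imag, current_sine(0.0017, *args))
```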






the impedance concept is generalized through the use of the 
Laplace transformation (Chap. 7). 

EXAMPLE 1 

A series circuit in which both the charge and the current are initially zero contains the elements $L = 1$, $R = 1{,}000$, $C = 6.25 \times 10^{-6}$. If a constant voltage $E = 24$ is suddenly switched into the circuit, find the peak value of the resultant current.

The differential equation we must solve is

$$\frac{d^2Q}{dt^2} + 1{,}000\frac{dQ}{dt} + \frac{Q}{6.25 \times 10^{-6}} = 24$$

subject to the conditions that $Q = i = 0$ when $t = 0$. The characteristic equation in this case is

$$m^2 + 1{,}000m + 160{,}000 = 0$$

and its roots are $m_1 = -200$, $m_2 = -800$. Hence, the complementary function is

$$c_1e^{-200t} + c_2e^{-800t}$$

To find a particular integral, we assume $Q = A$ and substitute into the differential equation, without difficulty getting

$$A = 150 \times 10^{-6}$$

The complete solution is, therefore,

$$Q = c_1e^{-200t} + c_2e^{-800t} + 150 \times 10^{-6}$$

and, differentiating,

$$\frac{dQ}{dt} = i = -200c_1e^{-200t} - 800c_2e^{-800t}$$

Substituting the initial conditions for $Q$ and $i$ gives the pair of equations

$$c_1 + c_2 + 150 \times 10^{-6} = 0 \qquad \text{and} \qquad c_1 + 4c_2 = 0$$

from which we find at once

$$c_1 = -200 \times 10^{-6} \qquad c_2 = 50 \times 10^{-6}$$

Hence,

$$i = 0.04(e^{-200t} - e^{-800t})$$

To find the time when $i$ is a maximum, we must equate to zero the time derivative of $i$:

$$0.04(-200e^{-200t} + 800e^{-800t}) = 0$$

Dividing out $0.04 \times 800e^{-200t}$ and transposing, we have $e^{-600t} = \tfrac{1}{4}$ and, taking logarithms, $t = 0.0023$ sec.

The maximum value of $i$ can now be found by substituting this value of $t$ into the general expression for $i$. The result is

$$i_{\max} = 0.019 \text{ amp}$$
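The arithmetic of Example 1 can be reproduced in a few lines. The sketch below is only a numerical check, with hypothetical function names: it evaluates $i = 0.04(e^{-200t} - e^{-800t})$ at the stationary point $t = (\ln 4)/600$ obtained from $e^{-600t} = \tfrac{1}{4}$ and confirms that neighboring times give a smaller current.

```python
import math

def current(t):
    """Current of Example 1: i = 0.04 (e^{-200 t} - e^{-800 t})."""
    return 0.04 * (math.exp(-200.0 * t) - math.exp(-800.0 * t))

def peak_time():
    """Time at which di/dt = 0, from e^{-600 t} = 1/4."""
    return math.log(4.0) / 600.0

if __name__ == "__main__":
    t_peak = peak_time()
    print(round(t_peak, 4), round(current(t_peak), 3))
```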


EXERCISES 

1 In Example 1, find the potential difference across each element as a function of time.

2 An open series circuit contains the elements $L = 0.01$, $R = 250$, $C = 10^{-6}$. At $t = 0$, with the condenser charged to the value $Q_0 = 10^{-5}$, the circuit is closed. Find the resultant current as a function of time.

3 Work Exercise 2, given that the circuit elements are $L = 6.4 \times 10^{-3}$, $R = 1.6 \times 10^{2}$, $C = 10^{-6}$.

4 Work Exercise 2, given that the circuit elements are $L = 0.01$, $R = 120$, $C = 10^{-6}$.

5 A voltage $E = 120\cos 120\pi t$ is suddenly switched into a series circuit containing the elements $L = 1$, $R = 800$, $C = 4 \times 10^{-6}$. What is the resultant steady-state current?

6 A series circuit in which $Q_0 = i_0 = 0$ contains the elements $L = 0.15$, $R = 800$, $C = 4 \times 10^{-6}$. If a constant voltage $E = 26$ is suddenly switched into the circuit, find the resultant current as a function of time.

7 Work Exercise 6, given that the circuit elements are $L = 0.16$, $R = 800$, $C = 10^{-6}$.

8 A series circuit in which $Q_0 = i_0 = 0$ contains the elements $L = 1$, $R = 1{,}000$, $C = 4 \times 10^{-6}$. A voltage $E = 110\sin 50\pi t$ is suddenly switched into the circuit. Find the resultant current as a function of time.

9 A series circuit in which $Q_0 = i_0 = 0$ contains the elements $L = 0.02$, $R = 250$, $C = 2 \times 10^{-6}$. A constant voltage $E = 28$ is suddenly switched into the circuit. Find the time it takes for the potential difference across the condenser to build up to one-half its final value.

10 A condenser $C = 4 \times 10^{-6}$, a resistance $R = 250$, and an inductance $L = 1$ are connected in parallel. A current source delivering a constant current $I = 0.01$ is suddenly connected across the common terminals of the elements. Find the resultant voltage as a function of time.

11 a Prove that, if a set of elements with impedance $Z_1$ is connected in series with a set of elements with impedance $Z_2$, then the impedance of the resultant combination is $Z_1 + Z_2$.
b Prove that, if a set of elements with impedance $Z_1$ is connected in parallel with a set of elements with impedance $Z_2$, then the impedance $Z$ of the resultant combination is given by the formula

$$\frac{1}{Z} = \frac{1}{Z_1} + \frac{1}{Z_2}$$

12 A constant voltage is suddenly switched into a nonoscillatory RLC circuit in which $Q_0 = i_0 = 0$. Show that the potential difference across the condenser can never overshoot its final value.

13 For what value(s) of $\omega$ is the impedance $\sqrt{R^2 + [\omega L - (1/\omega C)]^2}$ a minimum? Compare this with the corresponding property of the magnification ratio. Explain.

14 If the frequency of the voltage $E_0\cos\omega t$ impressed on a series circuit is the same as the natural frequency of the circuit, show that the amplitudes of the steady-state potential differences across the inductance and the capacitance are each equal to $E_0R_c/2R$.

15 Instead of using the ratio $R/R_c$ as a dimensionless parameter in circuit analysis, it is customary to use the so-called quality factor $Q$ (not to be confused with the charge $Q$) defined as $R_c/2R$. Express the impedance and the phase angle for a simple series circuit in terms of the resistance $R$, the frequency ratio and the quality factor $Q$.


Systems with several degrees of freedom 

The laws of Newton and Kirchhoff, together with the theory of 
simultaneous linear differential equations developed in Chap. 3 
and the theory of difference equations developed in Sec. 4.5, form 
the basis for the analysis of large classes of systems with more 
than one degree of freedom. The details of such applications can 
best be made clear through examples. 


EXAMPLE 1

Assuming friction to be negligible, find the natural frequencies of the mass-spring system shown in Fig. 5.13.

FIGURE 5.13 A simple mass-spring system.

As usual, we suppose the masses to be guided, by constraints which need not be specified, 
so that they can move only in the vertical direction. The instantaneous displacements of the 
masses from their equilibrium positions we shall use as coordinates to describe the system, dis- 
placements above the equilibrium positions being considered positive. Since friction is assumed 
to be negligible, the only forces acting on the masses besides the attraction of gravity are those 
transmitted to them by the attached springs. Moreover, as we saw in the derivation of Eq. (7), 
Sec. 5.2, the force of gravity can be neglected provided we also neglect the initial elongation 
of the springs and assume that each is unstretched when the system is in equilibrium. 

Now, when the displacements of the masses $m_1$ and $m_2$ are $y_1$ and $y_2$, respectively, the upper spring is changed in length by the amount $y_1$ and the lower spring is changed in length by the amount $y_1 - y_2$. Because of these changes in length, the springs exert forces equal to


respectively. Hence, applying Newton’s law to each mass and taking due account of the direc- 
tion of the forces applied to each mass by the attached springs, we have 


$$\begin{aligned}(4D^2 + 12)y_1 - 4y_2 &= 0\\ -4y_1 + (2D^2 + 4)y_2 &= 0\end{aligned} \tag{1}$$

From these equations we find that the equation satisfied by $y_1$ is

$$\begin{vmatrix}(4D^2 + 12) & -4\\ -4 & (2D^2 + 4)\end{vmatrix}\,y_1 = 0$$

or

$$(D^4 + 5D^2 + 4)y_1 = 0$$

The characteristic equation of this differential equation is $m^4 + 5m^2 + 4 = 0$, and its roots are $m = \pm i, \pm 2i$. Hence,

$$y_1 = c_1\cos t + c_2\sin t + c_3\cos 2t + c_4\sin 2t \tag{2}$$

Since the system (1) is homogeneous, it is evident that $y_2$ satisfies the same differential equation that $y_1$ satisfies. Therefore,

$$y_2 = d_1\cos t + d_2\sin t + d_3\cos 2t + d_4\sin 2t \tag{3}$$




Because we are concerned only with the frequencies at which the system can vibrate, there is no need to determine the relations which must exist between the $c$'s and the $d$'s. Whatever these relations may be, it is clear that Eqs. (2) and (3) represent periodic displacements at the frequencies $\omega_1 = 1$ and $\omega_2 = 2$. Moreover, since Eqs. (2) and (3) (when their coefficients are suitably related) constitute a complete solution of the system (1) and, hence, a complete description of the possible motion of the given physical system, it follows that free vibrations at frequencies other than $\omega_1$ and $\omega_2$ are impossible.
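The two frequencies can also be obtained numerically from the coefficients of the system (1). The sketch below is an illustration, not the author's procedure: the function name and parameter names are hypothetical, and the default values are simply the coefficients read off from Eq. (1). It expands the operational determinant as a quadratic in $s = m^2$ and returns $\omega = \sqrt{-s}$ for each root.

```python
import math

def natural_frequencies(m1=4.0, m2=2.0, k11=12.0, k22=4.0, k12=4.0):
    """Frequencies from det[[m1*s + k11, -k12], [-k12, m2*s + k22]] = 0, with s = m^2.

    Defaults are the coefficients of Eq. (1); the parameter names are assumptions.
    """
    a = m1 * m2
    b = m1 * k22 + m2 * k11
    c = k11 * k22 - k12 * k12
    root = math.sqrt(b * b - 4.0 * a * c)
    s_values = ((-b + root) / (2.0 * a), (-b - root) / (2.0 * a))
    # For an undamped system both roots s are negative, so m = +/- i*sqrt(-s).
    return sorted(math.sqrt(-s) for s in s_values)

if __name__ == "__main__":
    print(natural_frequencies())
```

With the default coefficients the quadratic is $8s^2 + 40s + 32 = 0$, whose roots $s = -1$ and $s = -4$ correspond to the frequencies 1 and 2 found above.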


EXAMPLE 2 


In the circuit shown in Fig. 5.14, find the current in each loop as a function of time, given that all charges and currents are zero when the switch is closed at $t = 0$.

FIGURE 5.14 A simple two-loop electrical circuit. (Element labels: 0.5 henry, $50 \times 10^{-6}$ farad, 300 ohms.)


We take as variables the currents $i_1$ and $i_2$ flowing in the respective loops, noting that the current in the common branch is, therefore, $i_1 - i_2$. Applying Kirchhoff's second law to each loop, we obtain the equations

$$0.5\frac{di_1}{dt} + 200(i_1 - i_2) = 50$$

$$300i_2 + 200(i_2 - i_1) + \frac{1}{50 \times 10^{-6}}\int i_2\,dt = 0$$

or, letting $Q_2 = \int i_2\,dt$,

$$\frac{di_1}{dt} + 400i_1 - 400\frac{dQ_2}{dt} = 100$$

$$-2i_1 + 5\frac{dQ_2}{dt} + 200Q_2 = 0$$

The characteristic equation of this system is

$$\begin{vmatrix}(m + 400) & -400m\\ -2 & (5m + 200)\end{vmatrix} = 5(m^2 + 280m + 16{,}000) = 0$$

From its roots, $m_1 = -80$ and $m_2 = -200$, we can construct the expressions

$$i_1 = a_1e^{-80t} + b_1e^{-200t}$$

$$Q_2 = a_2e^{-80t} + b_2e^{-200t}$$

which, after the constants $a_1$, $a_2$, $b_1$, $b_2$ are properly related, will constitute the complementary function of the system.

Substituting these expressions into the second of the two differential equations, we obtain

$$-2(a_1e^{-80t} + b_1e^{-200t}) + 5(-80a_2e^{-80t} - 200b_2e^{-200t}) + 200(a_2e^{-80t} + b_2e^{-200t}) = 0$$

This will be identically true if and only if $a_1 = -100a_2$ and $b_1 = -400b_2$. Therefore, the complementary function is

$$i_1 = -100a_2e^{-80t} - 400b_2e^{-200t}$$

$$Q_2 = a_2e^{-80t} + b_2e^{-200t}$$



To find a particular integral, we assume $i_1 = A_1$ and $Q_2 = A_2$. Substituting these into the nonhomogeneous system of differential equations, we obtain

$$400A_1 = 100$$

$$-2A_1 + 200A_2 = 0$$

Hence, $A_1 = \tfrac{1}{4}$ and $A_2 = \tfrac{1}{400}$ and, therefore, a complete solution of the system is

$$i_1 = -100a_2e^{-80t} - 400b_2e^{-200t} + \tfrac{1}{4}$$

$$Q_2 = a_2e^{-80t} + b_2e^{-200t} + \tfrac{1}{400}$$

Since $i_1 = 0$ and $Q_2 = 0$ when $t = 0$, we must have

$$0 = -100a_2 - 400b_2 + \tfrac{1}{4}$$

$$0 = a_2 + b_2 + \tfrac{1}{400}$$

From these we find without difficulty that $a_2 = -\tfrac{1}{240}$ and $b_2 = \tfrac{1}{600}$. The required currents are, therefore,

$$i_1 = \frac{5}{12}e^{-80t} - \frac{2}{3}e^{-200t} + \frac{1}{4}$$

$$i_2 = \frac{dQ_2}{dt} = \frac{1}{3}e^{-80t} - \frac{1}{3}e^{-200t}$$

Evidently $i_2 = 0$ when $t = 0$, as required.
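The solution of Example 2 can be verified by substitution. In the sketch below (function names are hypothetical), the constants $a_2 = -\tfrac{1}{240}$ and $b_2 = \tfrac{1}{600}$ are inserted into the complete solution, so that $i_1 = \tfrac{5}{12}e^{-80t} - \tfrac{2}{3}e^{-200t} + \tfrac{1}{4}$ and $Q_2 = -\tfrac{1}{240}e^{-80t} + \tfrac{1}{600}e^{-200t} + \tfrac{1}{400}$, and the residuals of the two differential equations $di_1/dt + 400i_1 - 400\,dQ_2/dt = 100$ and $-2i_1 + 5\,dQ_2/dt + 200Q_2 = 0$ are evaluated at a few times.

```python
import math

def i1(t):
    return (5.0 / 12.0) * math.exp(-80.0 * t) - (2.0 / 3.0) * math.exp(-200.0 * t) + 0.25

def di1_dt(t):
    return -80.0 * (5.0 / 12.0) * math.exp(-80.0 * t) + 200.0 * (2.0 / 3.0) * math.exp(-200.0 * t)

def q2(t):
    return -math.exp(-80.0 * t) / 240.0 + math.exp(-200.0 * t) / 600.0 + 1.0 / 400.0

def i2(t):
    """i2 = dQ2/dt."""
    return (math.exp(-80.0 * t) - math.exp(-200.0 * t)) / 3.0

def residuals(t):
    """Left side minus right side of each differential equation of the system."""
    r1 = di1_dt(t) + 400.0 * i1(t) - 400.0 * i2(t) - 100.0
    r2 = -2.0 * i1(t) + 5.0 * i2(t) + 200.0 * q2(t)
    return r1, r2

if __name__ == "__main__":
    for t in (0.0, 0.004, 0.02):
        print(t, residuals(t))
```

Residuals at the level of rounding error, together with zero initial values of $i_1$, $i_2$, and $Q_2$, indicate that the constants have been determined consistently.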


EXAMPLE 3 

Find the natural frequencies of the network shown in Fig. 5.15.

FIGURE 5.15 (As in Fig. 4.2, although the network shown appears to contain a definite number of loops, the number of loops is actually indefinite. This is indicated by the way a portion of the figure is drawn; this convention is used throughout the book to suggest a configuration of indefinite extent.)

By applying Kirchhoff's second law to each loop in turn, we obtain the equations

$$L\frac{di_1}{dt} + \frac{1}{C}\int (i_1 - i_2)\,dt = 0$$

$$\frac{1}{C}\int (i_2 - i_1)\,dt + L\frac{di_2}{dt} + \frac{1}{C}\int (i_2 - i_3)\,dt = 0$$

$$\cdots$$

$$\frac{1}{C}\int (i_{k+1} - i_k)\,dt + L\frac{di_{k+1}}{dt} + \frac{1}{C}\int (i_{k+1} - i_{k+2})\,dt = 0$$

$$\cdots$$

$$\frac{1}{C}\int (i_{n-1} - i_{n-2})\,dt + L\frac{di_{n-1}}{dt} + \frac{1}{C}\int (i_{n-1} - i_n)\,dt = 0$$

$$\frac{1}{C}\int (i_n - i_{n-1})\,dt + L\frac{di_n}{dt} + \frac{1}{C}\int i_n\,dt = 0$$

or, introducing new variables via the substitutions

$$\int i_k\,dt = Q_k \qquad i_k = \frac{dQ_k}{dt} = DQ_k \qquad \frac{di_k}{dt} = \frac{d^2Q_k}{dt^2} = D^2Q_k$$

and rearranging slightly,

$$\begin{aligned}(LCD^2 + 1)Q_1 - Q_2 &= 0\\ -Q_1 + (LCD^2 + 2)Q_2 - Q_3 &= 0\\ \cdots\\ -Q_k + (LCD^2 + 2)Q_{k+1} - Q_{k+2} &= 0\\ \cdots\\ -Q_{n-2} + (LCD^2 + 2)Q_{n-1} - Q_n &= 0\\ -Q_{n-1} + (LCD^2 + 2)Q_n &= 0\end{aligned} \tag{4}$$

Since there is no resistance anywhere in the network, it is evident that the response of the circuit to any set of nonzero initial conditions of charge and current will be purely oscillatory. Hence, we assume solutions of the form

$$Q_k = a_k\cos\omega t$$

where $\omega$ is the unknown frequency of the response and the $a$'s are arbitrary constants. Substituting into the equations of the set (4), dividing each equation by $-\cos\omega t$, and setting

$$LC\omega^2 = \alpha^2$$

we obtain the algebraic equations

$$\begin{aligned}-(1 - \alpha^2)a_1 + a_2 &= 0\\ a_1 - (2 - \alpha^2)a_2 + a_3 &= 0\\ \cdots\\ a_k - (2 - \alpha^2)a_{k+1} + a_{k+2} &= 0\\ \cdots\\ a_{n-2} - (2 - \alpha^2)a_{n-1} + a_n &= 0\\ a_{n-1} - (2 - \alpha^2)a_n &= 0\end{aligned} \tag{5}$$

In order for these equations to have a nontrivial solution, it is necessary that the determinant of their coefficients be zero. However, in this case the determinant of the coefficients is of the $n$th order, and to expand it and then solve the resulting $n$th-degree equation in $\alpha^2 = LC\omega^2$ would be prohibitively time-consuming. Hence it is much better to proceed in the following way: With the exception of the first and last equations, each equation of the system (5) is of the form

$$a_k - (2 - \alpha^2)a_{k+1} + a_{k+2} = 0$$

In other words, for $k = 1, 2, \ldots, n - 2$, the $a$'s satisfy the linear, constant-coefficient, second-order difference equation*

$$[E^2 - (2 - \alpha^2)E + 1]a_k = 0 \tag{6}$$

The first and last equations, which clearly do not fit into the pattern of Eq. (6), are, of course, the two boundary conditions necessary for the determination of the arbitrary constants which appear in the complete solution of this difference equation.


* This is true, of course, only because the loops of the network, with the 
exception of the first and the last, are all identical. In general, the possibility 
of using difference equations should always be considered in studying sys- 
tems, both electrical and mechanical, which consist essentially of a number 
of identical components, identically connected. 





Following the theory of Sec. 4.5, the first step in the solution of Eq. (6) is to solve its characteristic equation

$$m^2 - (2 - \alpha^2)m + 1 = 0 \tag{7}$$

whose roots are

$$m = \left(1 - \frac{\alpha^2}{2}\right) \pm \sqrt{\left(1 - \frac{\alpha^2}{2}\right)^2 - 1}$$

The continuation now involves an investigation of various special cases depending on the possible values of $1 - (\alpha^2/2)$.

First of all, we can immediately reject the possibility that $1 - (\alpha^2/2) \ge 1$, for this implies that $\alpha^2 \le 0$, which is impossible, since $\alpha^2 = LC\omega^2$ is an intrinsically positive quantity. Moreover, if $1 - (\alpha^2/2) = -1$, that is, if $\alpha^2 = 4$, then $m_1 = m_2 = -1$, and so, according to Table 4.2, Sec. 4.5, the complete solution of Eq. (6) is

$$a_k = (c_1 + c_2k)(-1)^k$$

Imposing the boundary conditions on $a_k$, by substituting $a_k$ into the first and last of the equations (5), we have

$$-(-3)[(c_1 + c_2)(-1)] + [(c_1 + 2c_2)(-1)^2] = -2c_1 - c_2 = 0$$

$$\{[c_1 + (n - 1)c_2](-1)^{n-1}\} - (-2)[(c_1 + nc_2)(-1)^n] = (-1)^n[c_1 + (n + 1)c_2] = 0$$

But these two equations obviously have only the trivial solution $c_1 = c_2 = 0$. Hence, $1 - (\alpha^2/2)$ cannot equal $-1$.

If $1 - (\alpha^2/2) < -1$, we can write

$$1 - \frac{\alpha^2}{2} = -\cosh\mu \tag{8}$$

so that the roots of the characteristic equation (7) become

$$-\cosh\mu \pm \sqrt{\cosh^2\mu - 1} = -\cosh\mu \pm \sinh\mu = -e^{\mp\mu}$$

Hence, a complete solution of (6) can be written

$$a_k = c_1(-e^{\mu})^k + c_2(-e^{-\mu})^k = (-1)^k(d_1\cosh\mu k + d_2\sinh\mu k)$$

where $d_1 = c_1 + c_2$ and $d_2 = c_1 - c_2$. Again imposing the boundary conditions on $a_k$, we have

$$-(1 + 2\cosh\mu)(d_1\cosh\mu + d_2\sinh\mu) + (d_1\cosh 2\mu + d_2\sinh 2\mu) = 0$$

$$(-1)^{n-1}[d_1\cosh(n - 1)\mu + d_2\sinh(n - 1)\mu] + (-1)^n\,2\cosh\mu\,(d_1\cosh n\mu + d_2\sinh n\mu) = 0$$

From these, by collecting terms and then simplifying through the use of the identities

$$2\cosh^2\mu = \cosh 2\mu + 1$$

$$2\sinh\mu\cosh\mu = \sinh 2\mu$$

$$2\cosh n\mu\cosh\mu = \cosh(n + 1)\mu + \cosh(n - 1)\mu$$

$$2\sinh n\mu\cosh\mu = \sinh(n + 1)\mu + \sinh(n - 1)\mu$$

we obtain

$$(1 + \cosh\mu)d_1 + (\sinh\mu)d_2 = 0$$

$$[\cosh(n + 1)\mu]d_1 + [\sinh(n + 1)\mu]d_2 = 0$$

These equations will have a nontrivial solution if and only if

$$\begin{vmatrix}1 + \cosh\mu & \sinh\mu\\ \cosh(n + 1)\mu & \sinh(n + 1)\mu\end{vmatrix} = \sinh(n + 1)\mu + \sinh n\mu = 0$$





This can vanish only if $\mu = 0$, which is impossible, since, from (8), $\mu = 0$ implies $1 - (\alpha^2/2) = -1$, and this possibility has already been considered and rejected. Hence the assumption $1 - (\alpha^2/2) < -1$ also leads only to a trivial solution.

It remains now to consider the possibility $-1 < 1 - (\alpha^2/2) < 1$. To investigate this case, let us put

$$1 - \frac{\alpha^2}{2} = \cos\mu \qquad 0 < \mu < \pi \tag{9}$$

Then the roots of the characteristic equation (7) are

$$\cos\mu \pm \sqrt{\cos^2\mu - 1} = \cos\mu \pm i\sin\mu = e^{\pm i\mu}$$

and a complete solution for $a_k$ is

$$a_k = c_1\cos k\mu + c_2\sin k\mu$$

Again imposing the boundary conditions on $a_k$, we have

$$-(2\cos\mu - 1)(c_1\cos\mu + c_2\sin\mu) + (c_1\cos 2\mu + c_2\sin 2\mu) = 0$$

$$[c_1\cos(n - 1)\mu + c_2\sin(n - 1)\mu] - 2\cos\mu\,(c_1\cos n\mu + c_2\sin n\mu) = 0$$

From these, by collecting terms and then simplifying through the use of the identities

$$2\cos^2\mu = 1 + \cos 2\mu$$

$$2\sin\mu\cos\mu = \sin 2\mu$$

$$2\cos n\mu\cos\mu = \cos(n + 1)\mu + \cos(n - 1)\mu$$

$$2\sin n\mu\cos\mu = \sin(n + 1)\mu + \sin(n - 1)\mu$$

we obtain

$$(\cos\mu - 1)c_1 + (\sin\mu)c_2 = 0$$

$$[\cos(n + 1)\mu]c_1 + [\sin(n + 1)\mu]c_2 = 0$$

These two equations will have a nontrivial solution for $c_1$ and $c_2$ if and only if

$$\begin{vmatrix}\cos\mu - 1 & \sin\mu\\ \cos(n + 1)\mu & \sin(n + 1)\mu\end{vmatrix} = \sin n\mu - \sin(n + 1)\mu = -2\sin\frac{\mu}{2}\cos\frac{2n + 1}{2}\mu = 0$$

Now $\sin(\mu/2)$ can be zero only if $\mu$ is a multiple of $2\pi$, which is impossible in the present case, since $0 < \mu < \pi$. Hence, we must have

$$\cos\frac{2n + 1}{2}\mu = 0$$

Therefore,

$$\frac{2n + 1}{2}\mu = \frac{2N + 1}{2}\pi$$

and

$$\mu = \frac{2N + 1}{2n + 1}\pi \qquad N = 0, 1, 2, \ldots$$

The values $N = 0, 1, \ldots, n - 1$ lead to distinct values of $\mu$ which in turn define the $n$ natural frequencies of the network, since, from (9),

$$\sqrt{LC}\,\omega = \alpha = \sqrt{2(1 - \cos\mu)} = 2\sin\frac{\mu}{2}$$

Hence the required frequencies are given by the formula

$$\omega_N = \frac{2}{\sqrt{LC}}\sin\frac{(2N + 1)\pi}{2(2n + 1)} \qquad N = 0, 1, \ldots, n - 1$$
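The closed-form result can be checked against the determinant condition that the text set aside as too laborious to expand by hand. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the frequency formula used is $\omega_N = (2/\sqrt{LC})\sin[(2N + 1)\pi/(2(2n + 1))]$, which follows from $\mu = (2N + 1)\pi/(2n + 1)$ and $\sqrt{LC}\,\omega = 2\sin(\mu/2)$. The determinant of the coefficients of system (5) is evaluated, up to sign, by the three-term recurrence for a tridiagonal matrix with diagonal entries $1 - \alpha^2, 2 - \alpha^2, \ldots, 2 - \alpha^2$ and off-diagonal entries $-1$.

```python
import math

def ladder_frequencies(n, L, C):
    """Natural frequencies 2/sqrt(LC) * sin((2N+1)pi / (2(2n+1))), N = 0, ..., n-1."""
    scale = 2.0 / math.sqrt(L * C)
    return [scale * math.sin((2 * N + 1) * math.pi / (2 * (2 * n + 1))) for N in range(n)]

def coefficient_determinant(n, alpha_sq):
    """Determinant (up to sign) of the n-by-n coefficient matrix of system (5).

    Uses D_k = d_k * D_{k-1} - D_{k-2}, with D_0 = 1 and D_1 = 1 - alpha^2.
    """
    prev, curr = 1.0, 1.0 - alpha_sq
    for _ in range(n - 1):
        prev, curr = curr, (2.0 - alpha_sq) * curr - prev
    return curr

if __name__ == "__main__":
    n, L, C = 5, 2.0, 0.5  # illustrative values only
    for w in ladder_frequencies(n, L, C):
        print(w, coefficient_determinant(n, L * C * w * w))
```

For each computed frequency the determinant should vanish to within rounding error, which is the condition for system (5) to have a nontrivial solution.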




EXERCISES

1 If $M_1 = 1$, $M_2 = 2$, $k_1 = 1$, $k_2 = k_3 = 2$ for the system shown in Fig. 5.16, find the natural frequencies of the system.

2 If $M_1 = 1$, $M_2 = 2$, $k_1 = 1$, $k_2 = k_3 = 3$ for the system shown in Fig. 5.16, find the natural frequencies of the system.

3 If $M_1 = M_2 = 1$, $k_1 = 1$, $k_2 = 3$, $k_3 = 9$ for the system shown in Fig. 5.16, find the natural frequencies of the system.

4 Find the displacements of $M_1$ and $M_2$ as functions of $t$ in Exercise 1, if the system starts from rest with $x_1 = 1$ and $x_2 = 0$.

5 In the system shown in Fig. 5.17 the parameters $M_1$, $k_1$, and $\omega$ are assumed to be known. Determine $k_2$ and $M_2$ so that in the steady-state forced motion of the system the mass $M_1$ will remain at rest.



6 Show that for no values of the parameters can the two natural frequencies of the system shown in Fig. 5.16 be equal.

7 A uniform bar 4 ft long and weighing 16 lb/ft is supported, as shown in Fig. 5.18, on springs of moduli 24 and 15 lb/in., respectively. If the springs are guided so that only vertical displacement of the center of the bar is possible, find the natural frequencies of the system.

FIGURE 5.18 (a) Bar of length $l = 48$ in. and weight $w = 64$ lb on springs of moduli $k_1 = 24$ lb/in. and $k_2 = 15$ lb/in. (b) The bar in a displaced position.

Hint: Take as coordinates the displacement $y$ of the center of the bar and the angle $\theta$ through which the bar turns.




8 In the network shown in Fig. 5.19 the current and the charge on the condenser in the closed 
loop are both zero, but the condenser in the open loop bears a charge Q 0 . Find the current 
in each loop as a function of time after the switch is closed. 


FIGURE 5.19 



9 Work Exercise 8 for the network shown in Fig. 5.20.

10 Find the current in each loop of the network shown in Fig. 5.21 if the switch is closed at an
instant when all charges and currents are zero.



12 In Example 3, find the normal modes, i.e., the sets of a’s for each of the natural frequencies. 

13 Work Example 3, with the condenser in series with the inductance in the last loop removed. 

14 Work Example 3, with the inductances and capacitances in each loop interchanged. 

15 Find the natural frequencies of the system of n equal masses connected by identical springs
shown in Fig. 5.23.





16 Find the natural frequencies of the system of n identical disks connected by identical 
lengths of elastic shafting shown in Fig. 5.24. 



17 Work Exercise 15, with the spring connecting the right-hand mass to the wall removed. 

18 Work Exercise 16, with the shaft connecting the right-hand disk to the wall removed. 

19 If a voltage E_0 cos ωt is inserted in series with the inductance in the first loop of the network
in Example 3, find expressions for the steady-state charges on the various condensers if

(a) 0 < ω < 2/√(LC)        (b) ω > 2/√(LC)

20 If the capacitances and inductances in the network in Example 3 are interchanged and if a
voltage E_0 cos ωt is then inserted in series with the capacitance in the first loop, find
expressions for the steady-state charges on the various condensers if

CHAPTER SIX

Fourier Series and Integrals


6.1

Introduction


FIGURE 6.1
Typical periodic forcing functions.

In Chap. 2 we learned that nonhomogeneous, linear, constant-coefficient
differential equations containing terms of the form

A cos ωt and B sin ωt

could easily be solved for all values of ω. Then in Chap. 5 we discovered
that such differential equations were fundamental in the
study of physical systems subjected to periodic disturbances. In
many cases, however, the forces, torques, voltages, or currents
which act on a system, although periodic, are by no means so
simple as pure sine and cosine waves. For instance, the voltage
impressed on an electrical circuit might consist of a series of pulses
as shown in Fig. 6.1a, or the disturbing influence acting on a
mechanical system might be a force of constant magnitude whose
direction is periodically and instantaneously reversed, as in
Fig. 6.1b.

This raises the question of whether or not a general periodic 




function* can be expressed as a series of sine and cosine terms.
Specifically, since, for all integral values of n,

\[ \sin\frac{n\pi(t + 2p)}{p} = \sin\frac{n\pi t}{p} \qquad\text{and}\qquad \cos\frac{n\pi(t + 2p)}{p} = \cos\frac{n\pi t}{p} \]

it is natural to ask whether an arbitrary function f(t) of period
2p can be represented by a series of the form

\[
\begin{aligned}
\frac{a_0}{2}{}^{\dagger} &+ a_1\cos\frac{\pi t}{p} + a_2\cos\frac{2\pi t}{p} + \cdots + a_n\cos\frac{n\pi t}{p} + \cdots \\
&+ b_1\sin\frac{\pi t}{p} + b_2\sin\frac{2\pi t}{p} + \cdots + b_n\sin\frac{n\pi t}{p} + \cdots
\end{aligned}
\]

If this is the case, then the methods of Chap. 5, applied to the
individual terms of such a series, will enable us, in fact, to analyze
the behavior of systems acted upon by general periodic disturbances.
The possibility of such expansions and their determination
when they exist are the subject matter of Fourier analysis,‡ to
which we shall devote this chapter.


6.2 

The Euler coefficients

To obtain formulas for the coefficients a_n and b_n in the expansion

\[
\begin{aligned}
f(t) = \frac{a_0}{2} &+ a_1\cos\frac{\pi t}{p} + a_2\cos\frac{2\pi t}{p} + \cdots + a_n\cos\frac{n\pi t}{p} + \cdots \\
&+ b_1\sin\frac{\pi t}{p} + b_2\sin\frac{2\pi t}{p} + \cdots + b_n\sin\frac{n\pi t}{p} + \cdots
\end{aligned}
\tag{1}
\]

assuming, of course, that it exists, we shall need the following
definite integrals, which are valid for all values of d, provided m
and n are integers satisfying the given restrictions:


\[ \int_d^{d+2p} \cos\frac{n\pi t}{p}\,dt = 0 \qquad n \ne 0 \tag{2} \]

\[ \int_d^{d+2p} \sin\frac{n\pi t}{p}\,dt = 0 \tag{3} \]

* A function f(t) is said to be periodic if there exists a constant 2p with the
property that

f(t + 2p) = f(t) for all t

If 2p is the smallest number for which this identity holds, it is called the
period of the function.

† The introduction of the factor 1/2 is a conventional device to render more
symmetrical the final formulas for the coefficients.

‡ Named for Joseph Fourier (1768-1830), French mathematician and confidant
of Napoleon, who first undertook the systematic study of such expansions
in a memorable monograph, “Théorie analytique de la chaleur,” published
in 1822. The use of such series in particular problems, however, dates
from the time of Daniel Bernoulli (1700-1782), who used them to solve
certain problems connected with vibrating strings.




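\[ \int_d^{d+2p} \cos\frac{m\pi t}{p}\cos\frac{n\pi t}{p}\,dt = 0 \qquad m \ne n \tag{4} \]

\[ \int_d^{d+2p} \cos^2\frac{n\pi t}{p}\,dt = p \qquad n \ne 0 \tag{5} \]

\[ \int_d^{d+2p} \cos\frac{m\pi t}{p}\sin\frac{n\pi t}{p}\,dt = 0 \tag{6} \]

\[ \int_d^{d+2p} \sin\frac{m\pi t}{p}\sin\frac{n\pi t}{p}\,dt = 0 \qquad m \ne n \tag{7} \]

\[ \int_d^{d+2p} \sin^2\frac{n\pi t}{p}\,dt = p \qquad n \ne 0 \tag{8} \]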
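The product integrals (4) to (8) can be spot-checked numerically. The following is a minimal sketch, not part of the original text: the helper names `integrate` and `orthogonality_table`, the midpoint rule, and the sample values p = 1.5, d = 0.3, m = 2, n = 5 are all illustrative assumptions.

```python
import math

def integrate(g, a, b, steps=2000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def orthogonality_table(p, d, m, n):
    """Approximate the product integrals of Eqs. (4)-(8) over (d, d + 2p)."""
    c = lambda k: (lambda t: math.cos(k * math.pi * t / p))
    s = lambda k: (lambda t: math.sin(k * math.pi * t / p))
    prod = lambda u, v: integrate(lambda t: u(t) * v(t), d, d + 2 * p)
    return {
        "cos_m*cos_n": prod(c(m), c(n)),   # Eq. (4), expected 0 for m != n
        "cos_n^2": prod(c(n), c(n)),       # Eq. (5), expected p
        "cos_m*sin_n": prod(c(m), s(n)),   # Eq. (6), expected 0
        "sin_m*sin_n": prod(s(m), s(n)),   # Eq. (7), expected 0 for m != n
        "sin_n^2": prod(s(n), s(n)),       # Eq. (8), expected p
    }

# Hypothetical sample values; any d and any integers m != n, n != 0 may be tried.
table = orthogonality_table(p=1.5, d=0.3, m=2, n=5)
for name, value in table.items():
    print(f"{name:12s} {value: .6f}")
```

Changing d leaves every entry unchanged up to rounding, which illustrates the remark that the integrals hold for all values of d.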


With these integrals available, the determination of a_n and b_n
proceeds as follows.*

To find a_0, we assume that the series (1) can legitimately be
integrated term by term from t = d to t = d + 2p.† Then

\[
\begin{aligned}
\int_d^{d+2p} f(t)\,dt = \frac{a_0}{2}\int_d^{d+2p} dt &+ a_1\int_d^{d+2p} \cos\frac{\pi t}{p}\,dt + \cdots + a_n\int_d^{d+2p} \cos\frac{n\pi t}{p}\,dt + \cdots \\
&+ b_1\int_d^{d+2p} \sin\frac{\pi t}{p}\,dt + \cdots + b_n\int_d^{d+2p} \sin\frac{n\pi t}{p}\,dt + \cdots
\end{aligned}
\]

The integral on the left can always be evaluated, since f(t) is a
known function which is assumed to be integrable. At worst, some
method of approximate integration, such as those we discussed
in Sec. 4.3, will be required. The first term on the right is simply

\[ \frac{1}{2}\,a_0 t\,\Big|_d^{d+2p} = p\,a_0 \]

By Eq. (2), all integrals with a cosine in the integrand vanish,
and, by Eq. (3), all integrals containing a sine vanish. Hence, the
integrated result reduces to

\[ \int_d^{d+2p} f(t)\,dt = p\,a_0 \]

or

\[ a_0 = \frac{1}{p}\int_d^{d+2p} f(t)\,dt \tag{9} \]

To find a_n (n = 1, 2, 3, . . .), we multiply each side of (1)
by cos nπt/p and then integrate from d to d + 2p, assuming
* The procedure here is analogous to the procedure we used in Sec. 4.6 to
express an arbitrary polynomial as a linear combination of orthogonal
polynomials. The most obvious difference is that in Sec. 4.6 the orthogonality
condition involved the summation of products of two different functions
over a discrete set of values, but here the orthogonality condition involves
the integration of products of two different functions.

† A sufficient condition for a series of integrable functions to be term-by-term
integrable is that it be uniformly convergent (Theorem 6, Sec. 15.1).




again that term-by-term integration is justified. This gives

\[
\begin{aligned}
\int_d^{d+2p} f(t)\cos\frac{n\pi t}{p}\,dt ={}& \frac{a_0}{2}\int_d^{d+2p} \cos\frac{n\pi t}{p}\,dt \\
&+ a_1\int_d^{d+2p} \cos\frac{\pi t}{p}\cos\frac{n\pi t}{p}\,dt + \cdots \\
&+ a_n\int_d^{d+2p} \cos^2\frac{n\pi t}{p}\,dt + \cdots \\
&+ b_1\int_d^{d+2p} \sin\frac{\pi t}{p}\cos\frac{n\pi t}{p}\,dt + \cdots \\
&+ b_n\int_d^{d+2p} \sin\frac{n\pi t}{p}\cos\frac{n\pi t}{p}\,dt + \cdots
\end{aligned}
\]

Again the integral on the left is completely determined. By Eqs.
(2) and (4), all integrals on the right containing only cosine terms
vanish except the one involving cos² nπt/p, which, by Eq. (5), is
equal to p. Finally, by Eq. (6), every integral which contains a
sine is zero. Hence,

\[ \int_d^{d+2p} f(t)\cos\frac{n\pi t}{p}\,dt = p\,a_n \]

or

\[ a_n = \frac{1}{p}\int_d^{d+2p} f(t)\cos\frac{n\pi t}{p}\,dt \tag{10} \]

To determine b_n, we continue essentially the same procedure.
We multiply (1) by sin nπt/p and then integrate from d to d + 2p,
getting

\[
\begin{aligned}
\int_d^{d+2p} f(t)\sin\frac{n\pi t}{p}\,dt ={}& \frac{a_0}{2}\int_d^{d+2p} \sin\frac{n\pi t}{p}\,dt \\
&+ a_1\int_d^{d+2p} \cos\frac{\pi t}{p}\sin\frac{n\pi t}{p}\,dt + \cdots \\
&+ a_n\int_d^{d+2p} \cos\frac{n\pi t}{p}\sin\frac{n\pi t}{p}\,dt + \cdots \\
&+ b_1\int_d^{d+2p} \sin\frac{\pi t}{p}\sin\frac{n\pi t}{p}\,dt + \cdots \\
&+ b_n\int_d^{d+2p} \sin^2\frac{n\pi t}{p}\,dt + \cdots
\end{aligned}
\]

As before, every integral on the right vanishes but one, leaving

\[ \int_d^{d+2p} f(t)\sin\frac{n\pi t}{p}\,dt = p\,b_n \]

or

\[ b_n = \frac{1}{p}\int_d^{d+2p} f(t)\sin\frac{n\pi t}{p}\,dt \tag{11} \]

Formulas (9), (10), and (11) are known as the Euler or
Euler-Fourier formulas, and the series (1), when its coefficients
have these values, is known as the Fourier series of f(t). In most
applications, the interval over which the coefficients are computed
is either (−p, p) or (0, 2p); so the value of d in the Euler formulas
is usually either −p or 0. Actually, the formula for a_0 need not
be listed, for it can be obtained from the general expression for
a_n by putting n = 0.† It was to achieve this that we wrote the
constant term as a_0/2 in the original expansion.
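When f(t) cannot be integrated in closed form, the Euler formulas can be applied numerically. The sketch below is illustrative only: the function names, the midpoint rule, and the test function (a trigonometric polynomial whose coefficients are known in advance) are assumptions, not part of the text.

```python
import math

def integrate(g, a, b, steps=4000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def euler_coefficients(f, p, n_max, d=None):
    """Approximate a_0..a_n_max and b_0..b_n_max from formulas (9)-(11).

    f has period 2p; the integration interval is (d, d + 2p), with d = -p
    by default. Note that a[0] is a_0, so the constant term is a[0] / 2.
    """
    if d is None:
        d = -p
    a, b = [], []
    for n in range(n_max + 1):
        w = n * math.pi / p
        a.append(integrate(lambda t: f(t) * math.cos(w * t), d, d + 2 * p) / p)
        b.append(integrate(lambda t: f(t) * math.sin(w * t), d, d + 2 * p) / p)
    return a, b

# Hypothetical test function: 3 + 2 cos(pi t/p) - sin(2 pi t/p), with p = 2.
p = 2.0
f = lambda t: 3.0 + 2.0 * math.cos(math.pi * t / p) - math.sin(2 * math.pi * t / p)
a, b = euler_coefficients(f, p, n_max=3)
print([round(x, 6) for x in a])
print([round(x, 6) for x in b])
```

Here the recovered values should be a_0 = 6 (so that a_0/2 = 3), a_1 = 2, and b_2 = −1, with all other coefficients zero up to rounding.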

We must be careful at this stage not to delude ourselves with
the belief that we have proved that every periodic function f(t)
has a Fourier expansion which converges to it. What our analysis
has shown is merely that if a function f(t) has an expansion of the
form (1) for which term-by-term integration is valid, then the
coefficients in that series must be given by the Euler formulas.
Questions concerning the convergence of Fourier series and, if
they converge, the conditions under which they will represent the
functions which generated them are many and difficult and are by
no means completely answered yet. These problems are primarily
of theoretical interest, however, for almost any conceivable practical
application is covered by the famous theorem of Dirichlet:‡

THEOREM 1 

If f(t) is a bounded periodic function which in any one period has at most a finite
number of local maxima and minima and a finite number of points of discontinuity,
then the Fourier series of f(t) converges to f(t) at all points where f(t) is
continuous and converges to the average of the right- and left-hand limits of f(t)
at each point where f(t) is discontinuous.

The conditions of Theorem 1, which are usually referred to
as the Dirichlet conditions, make it clear that a function need not
be continuous in order to possess a valid Fourier expansion. This
means that a function may have a graph consisting of a number
of disjointed arcs of different curves, each defined by a different
formula, and still be representable by a Fourier series. In using
the Euler formulas to find the coefficients in the expansion of
such a function it will, therefore, be necessary to break up the
range of integration (d, d + 2p) to correspond to the various segments
of the function. Thus, in Fig. 6.2, the function f(t) is


FIGURE 6.2
A periodic function defined by different formulas over different
portions of a period.

† It is not necessarily the case, however, that the value of a_0 in a particular
problem can be obtained by putting n = 0 in the integrated formula for a_n.
For instance, in Example 2, the integrated formula for a_n is indeterminate
when n = 0, and evaluation of the indeterminacy yields −3, instead of the
correct value 3 which is obtained by putting n = 0 before integrating.

‡ Named for the German mathematician Peter Gustave Lejeune Dirichlet
(1805-1859). For a proof of this theorem see, for instance, H. S. Carslaw,
“Fourier Series,” pp. 225-232, Dover Publications, Inc., New York, 1930.




defined by three different expressions, f_1(t), f_2(t), and f_3(t), on successive
portions of the period interval d ≤ t ≤ d + 2p. Hence it
is necessary to write the Euler formulas as

\[
\begin{aligned}
a_n &= \frac{1}{p}\int_d^{d+2p} f(t)\cos\frac{n\pi t}{p}\,dt \\
    &= \frac{1}{p}\int_d^{r} f_1(t)\cos\frac{n\pi t}{p}\,dt + \frac{1}{p}\int_r^{s} f_2(t)\cos\frac{n\pi t}{p}\,dt + \frac{1}{p}\int_s^{d+2p} f_3(t)\cos\frac{n\pi t}{p}\,dt \\
b_n &= \frac{1}{p}\int_d^{d+2p} f(t)\sin\frac{n\pi t}{p}\,dt \\
    &= \frac{1}{p}\int_d^{r} f_1(t)\sin\frac{n\pi t}{p}\,dt + \frac{1}{p}\int_r^{s} f_2(t)\sin\frac{n\pi t}{p}\,dt + \frac{1}{p}\int_s^{d+2p} f_3(t)\sin\frac{n\pi t}{p}\,dt
\end{aligned}
\]

Incidentally, according to Theorem 1, the Fourier series of the 
function shown in Fig. 6.2 will converge to the average values, 
indicated by dots, at the discontinuities at d, r, and d + 2 p, 
regardless of the definition (or lack of definition) of the function 
at these points. 


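EXAMPLE 1

What is the Fourier expansion of the periodic function whose definition in one period is

\[ f(t) = \begin{cases} 0 & -\pi < t < 0 \\ \sin t & 0 < t < \pi \end{cases} \]

In this case the half period of the given function is p = π. Hence, taking d = −π in the
Euler formulas, we have

\[
\begin{aligned}
a_n &= \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos nt\,dt = \frac{1}{\pi}\int_{-\pi}^{0} 0\cdot\cos nt\,dt + \frac{1}{\pi}\int_0^{\pi} \sin t\cos nt\,dt \\
    &= \frac{1}{\pi}\left[-\frac{1}{2}\left(\frac{\cos(1-n)t}{1-n} + \frac{\cos(1+n)t}{1+n}\right)\right]_0^{\pi} \\
    &= -\frac{1}{2\pi}\left[\frac{\cos(\pi - n\pi)}{1-n} + \frac{\cos(\pi + n\pi)}{1+n} - \left(\frac{1}{1-n} + \frac{1}{1+n}\right)\right] \\
    &= -\frac{1}{2\pi}\left(\frac{-\cos n\pi}{1-n} + \frac{-\cos n\pi}{1+n} - \frac{2}{1-n^2}\right) \\
    &= \frac{1 + \cos n\pi}{\pi(1 - n^2)} \qquad n \ne 1
\end{aligned}
\]

\[ a_1 = \frac{1}{\pi}\int_0^{\pi} \sin t\cos t\,dt = \frac{\sin^2 t}{2\pi}\Big|_0^{\pi} = 0 \]

\[
\begin{aligned}
b_n &= \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin nt\,dt = \frac{1}{\pi}\int_{-\pi}^{0} 0\cdot\sin nt\,dt + \frac{1}{\pi}\int_0^{\pi} \sin t\sin nt\,dt \\
    &= \frac{1}{\pi}\left[\frac{1}{2}\left(\frac{\sin(1-n)t}{1-n} - \frac{\sin(1+n)t}{1+n}\right)\right]_0^{\pi} = 0 \qquad n \ne 1
\end{aligned}
\]

\[ b_1 = \frac{1}{\pi}\int_0^{\pi} \sin^2 t\,dt = \frac{1}{\pi}\left[\frac{t}{2} - \frac{\sin 2t}{4}\right]_0^{\pi} = \frac{1}{2} \]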



Hence, evaluating the coefficients for n = 0, 1, 2, . . . , we have

\[ f(t) = \frac{1}{\pi} + \frac{\sin t}{2} - \frac{2}{\pi}\left(\frac{\cos 2t}{3} + \frac{\cos 4t}{15} + \frac{\cos 6t}{35} + \frac{\cos 8t}{63} + \cdots\right) \]

Plots showing the accuracy with which the first n terms of this series represent the given function
are shown in Fig. 6.3 for n = 1, 2, 3. For n = 4, 5, . . . the graphs of the partial sums are
almost indistinguishable from the graph of f(t).
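The closeness of the partial sums to f(t) can also be checked numerically. In this sketch the series is taken to be 1/π + (sin t)/2 − (2/π) Σ cos 2kt/(4k² − 1), which is the form the Euler formulas give for this example; the function names and sample points are illustrative assumptions.

```python
import math

def f(t):
    """Half-wave rectified sine of Example 1, extended with period 2*pi."""
    t = (t + math.pi) % (2 * math.pi) - math.pi   # reduce t to [-pi, pi)
    return math.sin(t) if t > 0 else 0.0

def partial_sum(t, terms):
    """Constant term, the sin t term, and the first `terms` cosine terms."""
    total = 1 / math.pi + 0.5 * math.sin(t)
    for k in range(1, terms + 1):
        total -= (2 / math.pi) * math.cos(2 * k * t) / (4 * k * k - 1)
    return total

points = [-2.0, -0.5, 0.0, 1.0, math.pi / 2, 3.0]
errors = [abs(partial_sum(t, 200) - f(t)) for t in points]
for t, err in zip(points, errors):
    print(f"t = {t: .4f}   |S_200(t) - f(t)| = {err:.2e}")
```

Because the cosine coefficients fall off like 1/k², the error after 200 cosine terms is bounded by roughly (2/π)/802, a little under 10⁻³, at every point.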





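Interesting numerical series can often be obtained from Fourier series by evaluating them
at specific points. For instance, if we set t = π/2 in the above expansion, we find

\[ 1 = \frac{1}{\pi} + \frac{1}{2} - \frac{2}{\pi}\left(-\frac{1}{3} + \frac{1}{15} - \frac{1}{35} + \frac{1}{63} - \cdots\right) \]

or

\[ \frac{1}{1\cdot 3} - \frac{1}{3\cdot 5} + \frac{1}{5\cdot 7} - \frac{1}{7\cdot 9} + \cdots = \frac{\pi - 2}{4} \]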
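A quick numerical check of this sum is easy to carry out. The sketch below assumes the series is 1/(1·3) − 1/(3·5) + 1/(5·7) − ··· with value (π − 2)/4, as obtained by setting t = π/2 in the expansion of Example 1; the function name is hypothetical.

```python
import math

def alternating_sum(terms):
    """Partial sum of 1/(1*3) - 1/(3*5) + 1/(5*7) - ... with `terms` terms."""
    total = 0.0
    for k in range(1, terms + 1):
        sign = 1.0 if k % 2 == 1 else -1.0
        total += sign / ((2 * k - 1) * (2 * k + 1))
    return total

target = (math.pi - 2) / 4
approx = alternating_sum(1000)
print(approx, target, abs(approx - target))
```

Since the series is alternating with decreasing terms, the error after N terms is smaller than the next term, about 1/(4N²).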


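EXAMPLE 2

Find the Fourier expansion of the periodic function whose definition in one period is

\[ f(t) = \begin{cases} -t & -3 < t < 0 \\ t & 0 < t < 3 \end{cases} \]

In this case the period of the function is 6. Hence p = 3, and, from (10) and (11), taking
d = −3, we have

\[
\begin{aligned}
a_n &= \frac{1}{3}\int_{-3}^{0} (-t)\cos\frac{n\pi t}{3}\,dt + \frac{1}{3}\int_0^{3} t\cos\frac{n\pi t}{3}\,dt \\
    &= -\frac{1}{3}\left[\frac{9}{n^2\pi^2}\cos\frac{n\pi t}{3} + \frac{3t}{n\pi}\sin\frac{n\pi t}{3}\right]_{-3}^{0} + \frac{1}{3}\left[\frac{9}{n^2\pi^2}\cos\frac{n\pi t}{3} + \frac{3t}{n\pi}\sin\frac{n\pi t}{3}\right]_0^{3} \\
    &= \frac{6}{n^2\pi^2}(\cos n\pi - 1) \qquad n \ne 0
\end{aligned}
\]

\[ a_0 = \frac{1}{3}\int_{-3}^{0} (-t)\,dt + \frac{1}{3}\int_0^{3} t\,dt = -\frac{t^2}{6}\Big|_{-3}^{0} + \frac{t^2}{6}\Big|_0^{3} = \frac{3}{2} + \frac{3}{2} = 3 \]

\[
\begin{aligned}
b_n &= \frac{1}{3}\int_{-3}^{0} (-t)\sin\frac{n\pi t}{3}\,dt + \frac{1}{3}\int_0^{3} t\sin\frac{n\pi t}{3}\,dt \\
    &= -\frac{1}{3}\left[\frac{9}{n^2\pi^2}\sin\frac{n\pi t}{3} - \frac{3t}{n\pi}\cos\frac{n\pi t}{3}\right]_{-3}^{0} + \frac{1}{3}\left[\frac{9}{n^2\pi^2}\sin\frac{n\pi t}{3} - \frac{3t}{n\pi}\cos\frac{n\pi t}{3}\right]_0^{3} \\
    &= \frac{3}{n\pi}\cos(-n\pi) - \frac{3}{n\pi}\cos n\pi = 0
\end{aligned}
\]

Substituting these coefficients into the series (1), we obtain

\[ f(t) = \frac{3}{2} - \frac{12}{\pi^2}\left(\cos\frac{\pi t}{3} + \frac{1}{9}\cos\frac{3\pi t}{3} + \frac{1}{25}\cos\frac{5\pi t}{3} + \cdots\right) \]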
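The coefficients of this example, and the warning that a_0 cannot always be read off from the integrated formula for a_n, can both be illustrated numerically. This is a sketch under stated assumptions: f(t) = |t| on (−3, 3), the closed form a_n = 6(cos nπ − 1)/(n²π²), and the helper names are all introduced here for illustration.

```python
import math

def integrate(g, a, b, steps=6000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def a_numeric(n):
    """a_n from the Euler formula (10) with p = 3, d = -3, f(t) = |t|."""
    return integrate(lambda t: abs(t) * math.cos(n * math.pi * t / 3), -3.0, 3.0) / 3

def a_closed(n):
    """Integrated formula 6 (cos n*pi - 1) / (n*pi)^2, meaningful for n != 0."""
    return 6 * (math.cos(n * math.pi) - 1) / (n * math.pi) ** 2

a0 = a_numeric(0)                 # n = 0 put in before integrating
limit_estimate = a_closed(1e-4)   # integrated formula evaluated near n = 0
print(a0, limit_estimate)
for n in (1, 2, 3):
    print(n, a_numeric(n), a_closed(n))
```

The direct value of a_0 is 3, whereas the integrated formula tends to −3 as n approaches 0, in agreement with the footnote on this point.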


EXERCISES 

Determine the Fourier expansions of the periodic functions whose definitions in one period are

1 f(t) = 1,  0 < t < π/2
         0,  π/2 < t < 2π

2 

fit) = 

{; ■ 

-a- < t < 0 

0 < f < x 

3 f(t) = sin t/2,  −π < t < π

4 f(t) = cos t,  −π/2 < t < π/2

5 f(t) = 2,  0 < t < 2π/3
         1,  2π/3 < t < 4π/3
         0,  4π/3 < t < 2π

6 f(t) = 0,  −3 < t < −1
         1 + cos πt,  −1 < t < 1
         0,  1 < t < 3

7 f(t) = t,  −π < t < π

8 f(t) = e^(−t),  0 < t < 1

9 f(t) = π² − t²,  −π < t < π

10 f(t) = t − t³,  −1 < t < 1



11 f(t) = 0,  −2 < t < −1
          1,  −1 < t < 0
          −1,  0 < t < 1
          0,  1 < t < 2

12 f(t) = 0,  −2 < t < −1
          1 + t,  −1 < t < 0
          1 − t,  0 < t < 1
          0,  1 < t < 2

13 f(t) = cos t,  −π < t < 0
          sin t,  0 < t < π

14 f(t) = 0,  −π < t < 0
          t²,  0 < t < π

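15 Establish the following numerical results:

\[ 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \frac{1}{5^2} + \cdots = \frac{\pi^2}{6} \]

\[ 1 - \frac{1}{2^2} + \frac{1}{3^2} - \frac{1}{4^2} + \frac{1}{5^2} - \cdots = \frac{\pi^2}{12} \]

\[ 1 + \frac{1}{3^2} + \frac{1}{5^2} + \frac{1}{7^2} + \frac{1}{9^2} + \cdots = \frac{\pi^2}{8} \]

(Hint: Use the results of Exercise 14.)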




6.3 

Half-range expansions 


When f(t) possesses certain symmetry properties, the coefficients
in its Fourier expansion become especially simple. This was
illustrated in Example 2 of the last section, where the given function
was symmetric in the y-axis and its expansion contained
only cosine terms, i.e., only terms which themselves were symmetric
in the vertical axis. In this section we shall investigate in
detail just what effect the symmetry of f(t) has on the coefficients
in the Fourier series for f(t).

Suppose first that f(t) is an even function; i.e., suppose that

f(−t) = f(t) for all t

or, geometrically, that the graph of f(t) is symmetric in the
vertical axis. Taking d = −p in the formula for a_n, Eq. (10),
Sec. 6.2, we can write


\[ a_n = \frac{1}{p}\int_{-p}^{p} f(t)\cos\frac{n\pi t}{p}\,dt = \frac{1}{p}\int_{-p}^{0} f(t)\cos\frac{n\pi t}{p}\,dt + \frac{1}{p}\int_0^{p} f(t)\cos\frac{n\pi t}{p}\,dt \]

Now, in the integral from −p to 0, let us make the substitution

t = −s        dt = −ds

Then, since t = −p implies s = p and t = 0 implies s = 0, the
integral becomes

\[ \frac{1}{p}\int_p^{0} f(-s)\cos\left(-\frac{n\pi s}{p}\right)(-ds) \tag{1} \]

But f(−s) = f(s), from the hypothesis that f(t) is an even function.
Moreover, the cosine is also an even function; that is,

\[ \cos\left(-\frac{n\pi s}{p}\right) = \cos\frac{n\pi s}{p} \]

Finally, the negative sign associated with ds in (1) can be eliminated
by changing the limits back to the normal order, 0 to p.
The integral (1) then becomes

\[ \frac{1}{p}\int_0^{p} f(s)\cos\frac{n\pi s}{p}\,ds \]

and thus a_n can be written

\[ a_n = \frac{1}{p}\int_0^{p} f(s)\cos\frac{n\pi s}{p}\,ds + \frac{1}{p}\int_0^{p} f(t)\cos\frac{n\pi t}{p}\,dt = \frac{2}{p}\int_0^{p} f(t)\cos\frac{n\pi t}{p}\,dt \]

since the two integrals are identical, except for the dummy
variable of integration, which is immaterial.

Similarly, we can write

\[ b_n = \frac{1}{p}\int_{-p}^{0} f(t)\sin\frac{n\pi t}{p}\,dt + \frac{1}{p}\int_0^{p} f(t)\sin\frac{n\pi t}{p}\,dt \]




Again, putting t = −s and dt = −ds
in the first integral, we find

\[ b_n = \frac{1}{p}\int_p^{0} f(-s)\sin\left(-\frac{n\pi s}{p}\right)(-ds) + \frac{1}{p}\int_0^{p} f(t)\sin\frac{n\pi t}{p}\,dt \]

But, by hypothesis, f(−s) = f(s), and

\[ \sin\left(-\frac{n\pi s}{p}\right) = -\sin\frac{n\pi s}{p} \]

Hence, reversing the limits on the first integral, as before, we
have

\[ b_n = -\frac{1}{p}\int_0^{p} f(s)\sin\frac{n\pi s}{p}\,ds + \frac{1}{p}\int_0^{p} f(t)\sin\frac{n\pi t}{p}\,dt = 0 \]

since, except for the irrelevant variable of integration, the two
integrals are identical in all but sign. Thus we have established
the following useful result:

THEOREM 1

If f(t) is an even periodic function, then the coefficients in the Fourier series of
f(t) are given by the formulas

\[ a_n = \frac{2}{p}\int_0^{p} f(t)\cos\frac{n\pi t}{p}\,dt \qquad b_n = 0 \]

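Now suppose that f(t) is an odd function; i.e., suppose that

f(−t) = −f(t) for all t

or, geometrically, that the graph of f(t) is symmetric in the origin.
Then proceeding just as before, we can write

\[
\begin{aligned}
a_n &= \frac{1}{p}\int_{-p}^{0} f(t)\cos\frac{n\pi t}{p}\,dt + \frac{1}{p}\int_0^{p} f(t)\cos\frac{n\pi t}{p}\,dt \\
    &= \frac{1}{p}\int_p^{0} f(-s)\cos\left(-\frac{n\pi s}{p}\right)(-ds) + \frac{1}{p}\int_0^{p} f(t)\cos\frac{n\pi t}{p}\,dt \\
    &= -\frac{1}{p}\int_0^{p} f(s)\cos\frac{n\pi s}{p}\,ds + \frac{1}{p}\int_0^{p} f(t)\cos\frac{n\pi t}{p}\,dt \\
    &= 0 \\
b_n &= \frac{1}{p}\int_{-p}^{0} f(t)\sin\frac{n\pi t}{p}\,dt + \frac{1}{p}\int_0^{p} f(t)\sin\frac{n\pi t}{p}\,dt \\
    &= \frac{1}{p}\int_p^{0} f(-s)\sin\left(-\frac{n\pi s}{p}\right)(-ds) + \frac{1}{p}\int_0^{p} f(t)\sin\frac{n\pi t}{p}\,dt \\
    &= \frac{1}{p}\int_0^{p} f(s)\sin\frac{n\pi s}{p}\,ds + \frac{1}{p}\int_0^{p} f(t)\sin\frac{n\pi t}{p}\,dt \\
    &= \frac{2}{p}\int_0^{p} f(t)\sin\frac{n\pi t}{p}\,dt
\end{aligned}
\]

Thus we have established the following result: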




THEOREM 2

If f(t) is an odd periodic function, then the coefficients in the Fourier series of f(t)
are given by the formulas

\[ a_n = 0 \qquad b_n = \frac{2}{p}\int_0^{p} f(t)\sin\frac{n\pi t}{p}\,dt \]
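Theorems 1 and 2 can be checked numerically by comparing the full-range Euler formulas with the half-range formulas. The sketch below is illustrative only: the helper names, the midpoint rule, and the sample functions t² (even) and t³ (odd) on (−1, 1) are assumptions.

```python
import math

def integrate(g, a, b, steps):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def full_range(f, p, n, steps=2000):
    """(a_n, b_n) from the Euler formulas with d = -p."""
    w = n * math.pi / p
    a = integrate(lambda t: f(t) * math.cos(w * t), -p, p, steps) / p
    b = integrate(lambda t: f(t) * math.sin(w * t), -p, p, steps) / p
    return a, b

def half_range_cosine(f, p, n, steps=1000):
    """a_n = (2/p) * integral from 0 to p of f(t) cos(n pi t / p) dt."""
    w = n * math.pi / p
    return 2 * integrate(lambda t: f(t) * math.cos(w * t), 0.0, p, steps) / p

def half_range_sine(f, p, n, steps=1000):
    """b_n = (2/p) * integral from 0 to p of f(t) sin(n pi t / p) dt."""
    w = n * math.pi / p
    return 2 * integrate(lambda t: f(t) * math.sin(w * t), 0.0, p, steps) / p

p = 1.0
even = lambda t: t * t      # hypothetical even sample function
odd = lambda t: t ** 3      # hypothetical odd sample function

a_even, b_even = full_range(even, p, 3)
a_odd, b_odd = full_range(odd, p, 3)
print(a_even, half_range_cosine(even, p, 3), b_even)
print(b_odd, half_range_sine(odd, p, 3), a_odd)
```

For the even function the sine coefficient vanishes and the cosine coefficient agrees with the half-range formula; for the odd function the roles are reversed.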

It should be emphasized here that oddness and evenness are
not intrinsic properties of a graph, but depend upon its relation
to the axes of the coordinate system. For instance, in Fig. 6.4, if
the line AA' is chosen as the vertical axis, the graph represents
an even function whose Fourier expansion, in accordance with
Theorem 1, will contain only cosine terms. On the other hand, if
BB' is chosen as the vertical axis, the graph represents an odd
function, and, by Theorem 2, only sine terms will appear in its
expansion. Finally, if a general line, such as CC', is chosen as the
vertical axis, the graph represents a function which is neither odd
nor even, and both sines and cosines will appear in its Fourier
series.


FIGURE 6.4
Plot showing how oddness and evenness depend on the choice of axes.


The observations we have just made about the Fourier 
coefficients of odd and even functions serve to reduce by half the 
labor of expanding such functions. However, their chief value is 
that they allow us to meet the requirements of certain problems* 
in which expansions containing only cosine terms or expansions 
containing only sine terms must be constructed. 

Let us suppose that the conditions of a problem require us to 
consider the values of a function only in the interval from 0 to 
p. In other words, conditions of periodicity are irrelevant to the 
problem, and what the function may be outside the range (0,p) is 
completely immaterial. This being the case, we can define the 
function in any way we please over the interval (— p, 0) and then 
use the Euler formulas to determine the coefficients in the Fourier 
series of its periodic extension. Between — p and 0 this series will, 
of course, converge to whatever extension we created over this 
interval, but irrespective of this extension the series will represent 
the given function between 0 and p, as required. 

In particular, if we extend the function from 0 to −p


* Examples of such problems will be found in Sec. 8.4.





by reflecting it in the vertical axis, so that f(−t) = f(t), the
original function together with its extension is even; hence,
the Fourier expansion of its periodic continuation will contain
only cosine terms [including, of course, the constant term
a_0/2 = (a_0/2) cos (0πt/p)] whose coefficients, as we showed above,
will be given by

\[ a_n = \frac{2}{p}\int_0^{p} f(t)\cos\frac{n\pi t}{p}\,dt \]

On the other hand, if we extend the function from 0 to −p by
reflecting it in the origin, so that f(−t) = −f(t), the extended
function is odd, and hence the Fourier series of its periodic continuation
will contain only sine terms, whose coefficients will be
given by

\[ b_n = \frac{2}{p}\int_0^{p} f(t)\sin\frac{n\pi t}{p}\,dt \]

Thus, simply by imagining the appropriate extension of a function
originally defined only for 0 < t < p, we can obtain expansions
representing the function on this interval and containing
only cosine terms or only sine terms, as we please. Such series are
known as half-range expansions.


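EXAMPLE 1

Find the half-range expansions of the function

f(t) = t − t²        0 < t < 1

The half-range cosine expansion is obtained by first extending t − t² from the given interval
(0,1) to the interval (−1,0) by reflection in the y-axis and then taking the function thus defined
from −1 to 1 as one period of a periodic function of period 2p = 2. However, once we understand
the reasoning underlying the procedure we need give no thought to the extension but can
write immediately, on the basis of Theorem 1,

\[
\begin{aligned}
a_n &= \frac{2}{1}\int_0^{1} (t - t^2)\cos n\pi t\,dt \\
    &= 2\left[\left(\frac{\cos n\pi t}{n^2\pi^2} + \frac{t}{n\pi}\sin n\pi t\right) - \left(\frac{2t}{n^2\pi^2}\cos n\pi t + \left(\frac{t^2}{n\pi} - \frac{2}{n^3\pi^3}\right)\sin n\pi t\right)\right]_0^{1} \\
    &= 2\left(\frac{\cos n\pi - 1}{n^2\pi^2} - \frac{2\cos n\pi}{n^2\pi^2}\right) = -\frac{2(1 + \cos n\pi)}{n^2\pi^2} \qquad n \ne 0
\end{aligned}
\]

\[ a_0 = 2\int_0^{1} (t - t^2)\,dt = 2\left[\frac{t^2}{2} - \frac{t^3}{3}\right]_0^{1} = \frac{1}{3} \]

Hence it is possible to represent f(t) = t − t² for 0 < t < 1 by the series

\[ f(t) = \frac{1}{6} - \frac{4}{\pi^2}\left(\frac{\cos 2\pi t}{4} + \frac{\cos 4\pi t}{16} + \frac{\cos 6\pi t}{36} + \frac{\cos 8\pi t}{64} + \cdots\right) \tag{2} \]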




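Similarly, the half-range sine expansion is obtained by first extending the given function t − t²
to the interval (−1,0) by reflection in the origin and then extending periodically the function
thus defined over (−1,1). However, all we need to obtain the expansion is to note that, according
to Theorem 2,

\[
\begin{aligned}
b_n &= \frac{2}{1}\int_0^{1} (t - t^2)\sin n\pi t\,dt \\
    &= 2\left[\left(\frac{\sin n\pi t}{n^2\pi^2} - \frac{t}{n\pi}\cos n\pi t\right) - \left(\frac{2t}{n^2\pi^2}\sin n\pi t - \left(\frac{t^2}{n\pi} - \frac{2}{n^3\pi^3}\right)\cos n\pi t\right)\right]_0^{1} \\
    &= \frac{4(1 - \cos n\pi)}{n^3\pi^3}
\end{aligned}
\]

Hence it is also possible to represent f(t) for 0 < t < 1 by the series

\[ f(t) = \frac{8}{\pi^3}\left(\frac{\sin \pi t}{1} + \frac{\sin 3\pi t}{27} + \frac{\sin 5\pi t}{125} + \frac{\sin 7\pi t}{343} + \cdots\right) \tag{3} \]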
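Both half-range series can be summed numerically and compared with t − t². The sketch below assumes the cosine series is 1/6 − (4/π²) Σ cos nπt/n² over even n and the sine series is (8/π³) Σ sin nπt/n³ over odd n, which is what Theorems 1 and 2 give for this function; the function names and the sample point t = 0.3 are illustrative.

```python
import math

def cosine_series(t, terms):
    """Partial sum of the half-range cosine series (even n = 2, 4, 6, ...)."""
    total = 1 / 6
    for k in range(1, terms + 1):
        n = 2 * k
        total -= (4 / math.pi ** 2) * math.cos(n * math.pi * t) / n ** 2
    return total

def sine_series(t, terms):
    """Partial sum of the half-range sine series (odd n = 1, 3, 5, ...)."""
    total = 0.0
    for k in range(1, terms + 1):
        n = 2 * k - 1
        total += (8 / math.pi ** 3) * math.sin(n * math.pi * t) / n ** 3
    return total

t = 0.3
exact = t - t * t
for terms in (2, 5, 50, 500):
    print(terms, cosine_series(t, terms) - exact, sine_series(t, terms) - exact)
```

Printing the two error columns side by side gives a feel for how much faster the 1/n³ series settles down than the 1/n² series.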

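Series (2) and (3) are by no means the only Fourier series that will represent t − t² on the
interval (0,1). They are merely the most convenient or most useful ones. In fact, with every
possible extension of t − t² from 0 to −1 there is associated a series yielding t − t² for 0 < t < 1.
For instance, a third such series might be obtained by letting the extension be simply the function
defined by t − t² itself for −1 < t < 0. In this case

\[
\begin{aligned}
a_n &= \frac{1}{1}\int_{-1}^{1} (t - t^2)\cos n\pi t\,dt \\
    &= \left[\left(\frac{\cos n\pi t}{n^2\pi^2} + \frac{t}{n\pi}\sin n\pi t\right) - \left(\frac{2t}{n^2\pi^2}\cos n\pi t + \left(\frac{t^2}{n\pi} - \frac{2}{n^3\pi^3}\right)\sin n\pi t\right)\right]_{-1}^{1} \\
    &= -\frac{4\cos n\pi}{n^2\pi^2} \qquad n \ne 0
\end{aligned}
\]

\[ a_0 = \int_{-1}^{1} (t - t^2)\,dt = \left[\frac{t^2}{2} - \frac{t^3}{3}\right]_{-1}^{1} = -\frac{2}{3} \]

\[
\begin{aligned}
b_n &= \frac{1}{1}\int_{-1}^{1} (t - t^2)\sin n\pi t\,dt \\
    &= \left[\left(\frac{\sin n\pi t}{n^2\pi^2} - \frac{t}{n\pi}\cos n\pi t\right) - \left(\frac{2t}{n^2\pi^2}\sin n\pi t - \left(\frac{t^2}{n\pi} - \frac{2}{n^3\pi^3}\right)\cos n\pi t\right)\right]_{-1}^{1} \\
    &= -\frac{2\cos n\pi}{n\pi}
\end{aligned}
\]

Hence, for 0 < t < 1 it is also possible to write

\[
\begin{aligned}
f(t) = -\frac{1}{3} &+ \frac{4}{\pi^2}\left(\frac{\cos \pi t}{1} - \frac{\cos 2\pi t}{4} + \frac{\cos 3\pi t}{9} - \frac{\cos 4\pi t}{16} + \cdots\right) \\
&+ \frac{2}{\pi}\left(\frac{\sin \pi t}{1} - \frac{\sin 2\pi t}{2} + \frac{\sin 3\pi t}{3} - \frac{\sin 4\pi t}{4} + \cdots\right)
\end{aligned}
\tag{4}
\]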


FIGURE 6.5
Plot showing different periodic functions coinciding over the
interval (0,1).


Figure 6.5a, b, and c shows the extended periodic functions represented, respectively, by the 
series (2), (3), and (4). 
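The behavior of the third series at a jump can be examined numerically. The sketch below assumes that the extension f(t) = t − t² on (−1, 1) has Euler coefficients a_0 = −2/3, a_n = −4 cos nπ/(n²π²), and b_n = −2 cos nπ/(nπ), as computed from formulas (9) to (11); the function name is hypothetical. At t = 1 the left- and right-hand limits of the periodic extension are 0 and −2, so Dirichlet's theorem predicts the value −1 there.

```python
import math

def series_4(t, terms):
    """Partial sum of the full Fourier series of t - t^2 extended from (-1, 1)."""
    total = -1 / 3
    for n in range(1, terms + 1):
        sign = 1.0 if n % 2 == 1 else -1.0   # equals -cos(n*pi)
        total += sign * (4 / math.pi ** 2) * math.cos(n * math.pi * t) / n ** 2
        total += sign * (2 / math.pi) * math.sin(n * math.pi * t) / n
    return total

inside = series_4(0.5, 4000)    # point of continuity: f(0.5) = 0.25
at_jump = series_4(1.0, 4000)   # discontinuity: average of 0 and -2
print(inside, at_jump)
```

The slow 1/n decay of the sine coefficients is the reason several thousand terms are used here.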

Figure 6.5 and the associated expansions illustrate another
interesting and important fact. In Fig. 6.5c the graph as a whole
is not continuous but has jumps at t = ±1, ±3, ±5, . . . . In
the corresponding series (4), the coefficients (of the sine terms)
decrease only at a rate proportional to 1/n. On the other hand,
the graph in Fig. 6.5a is everywhere continuous but has corners,
or points where the tangent changes direction discontinuously.
In the corresponding series (2), the coefficients all become small
much more rapidly than in (4); in fact, they decrease at a rate
proportional to 1/n². Finally, the graph in Fig. 6.5b not only is
continuous but has a continuous tangent; i.e., there are no points
where the tangent changes direction abruptly. This smoother
behavior of the function is reflected in the coefficients in the corresponding
series (3), which in this case approach zero at a rate
proportional to 1/n³. These observations are summed up and
generalized in the following theorem, which we cite without
proof:*

THEOREM 3 

As n becomes infinite, the coefficients a_n and b_n in the Fourier expansion of a
periodic function satisfying the Dirichlet conditions always approach zero at least
as rapidly as c/n, where c is a constant independent of n. If the function has one
or more points of discontinuity, then either a_n or b_n, and in general both, can
decrease no faster than this. In general, if a function f(t) and its first k − 1 derivatives
satisfy the Dirichlet conditions and are everywhere continuous, then as n


* See, for instance, H. S. Carslaw, “Fourier Series,” pp. 269-271, Dover 
Publications, Inc., New York, 1930. 




becomes infinite the coefficients a_n and b_n in the Fourier series of f(t) tend to zero
at least as rapidly as c/n^(k+1). If, in addition, the kth derivative of f(t) is not everywhere
continuous, then either a_n or b_n, and in general both, can tend to zero no
faster than c/n^(k+1).

More concisely, though less accurately, this theorem asserts that 
the smoother the function, the faster its Fourier expansion 
converges. 
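The decay rates can be observed directly by computing half-range coefficients numerically and scaling them by the expected power of n. This sketch is illustrative only: it uses f(t) = t − t² on (0, 1), a midpoint rule, and hypothetical helper names. The even extension has corners, so n²|a_n| should level off, while the odd extension has a continuous tangent, so n³|b_n| should level off.

```python
import math

def integrate(g, a, b, steps=4000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def f(t):
    return t - t * t

def cosine_coeff(n):
    """Half-range cosine coefficient a_n with p = 1."""
    return 2 * integrate(lambda t: f(t) * math.cos(n * math.pi * t), 0.0, 1.0)

def sine_coeff(n):
    """Half-range sine coefficient b_n with p = 1."""
    return 2 * integrate(lambda t: f(t) * math.sin(n * math.pi * t), 0.0, 1.0)

# Even n for the cosine series and odd n for the sine series,
# since the other coefficients of this particular function vanish.
cos_scaled = [n ** 2 * abs(cosine_coeff(n)) for n in (10, 20, 40)]
sin_scaled = [n ** 3 * abs(sine_coeff(n)) for n in (11, 21, 41)]
print(cos_scaled)
print(sin_scaled)
```

The two scaled lists should settle near 4/π² and 8/π³ respectively, the constants that appear in the closed-form coefficients of this function.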

Closely associated with the last result are the following 
observations, which we also state without proof : 

THEOREM 4* 

The integral of any periodic function which satisfies the Dirichlet conditions can 
be found by term-by-term integration of the Fourier series of the function. 

THEOREM 5†

If f(t) is a periodic function which satisfies the Dirichlet conditions and is everywhere
continuous and if f'(t) also satisfies the Dirichlet conditions, then wherever
it exists, f'(t) can be found by term-by-term differentiation of the Fourier series
of f(t).


EXERCISES 

1 By considering the identity f(t) = ½[f(t) + f(−t)] + ½[f(t) − f(−t)], show that any
function, defined for both positive and negative values of t, can be written as the sum of an
even function and an odd function.

Obtain the half-range sine and cosine expansions of each of the following functions:

2 f(t) = 1,  0 < t < 1

3 f(t) = e^t,  0 < t < 1

4 f(t) = cos t,  0 < t < 2π

5 f(t) = sin t,  0 < t < 2π

6 f(t) = t,  0 < t < 2
         6 − 2t,  2 < t < 3

7


8 Obtain a series, different from the half-range sine expansion, which will represent t − t²
for 0 < t < 1 and whose coefficients will decrease as 1/n³.

9 Is it possible to obtain a series representing t − t² for 0 < t < 1 whose coefficients will
decrease as 1/n⁴?

10 Find a function whose half-range cosine series will have coefficients decreasing as 1/n⁴.
Determine the expansion.

11 How rapidly will the coefficients in the Fourier series of the periodic function 1/(2 + cos t)
decrease?


12 If

f(t) = 1,  0 < t < a
       1,  a < t < 1
       0,  1 < t < 2

and if a is only slightly less than 1, discuss the behavior of the coefficients in the
half-range cosine expansion of f(t) for small and medium values
of n as well as for n → ∞.


* See, for instance, E. C. Titchmarsh, “Theory of Functions,” pp. 419-421,
Oxford Book Company, Inc., New York, 1939.

† See, for instance, E. T. Whittaker and G. N. Watson, “Modern Analysis,”
pp. 168-169, The Macmillan Company, New York, 1943.




13 If f(t), originally defined only for 0 < t < p, is extended from p to 2p by reflection in the
line t = p, show that the half-range sine expansion of the extended function contains no
terms of the form

\[ \sin\frac{n\pi t}{2p} \qquad n \text{ even} \]

and show that the coefficients of the terms of the form

\[ \sin\frac{n\pi t}{2p} \qquad n \text{ odd} \]

are given by the formula

\[ b_n = \frac{2}{p}\int_0^{p} f(t)\sin\frac{n\pi t}{2p}\,dt \]

14 Determine how f(t), originally defined only for 0 < t < p, must be extended from p to 2p
in order that the half-range cosine expansion of the extended function will contain no terms
of the form

\[ \cos\frac{n\pi t}{2p} \qquad n \text{ even} \]

Derive a formula for the nonzero coefficients.

15 Prove Theorem 3 for the special case k = 1. Under what conditions can either a_n or b_n
decrease faster than c/n²? [Hint: Assuming that f'(t) has a single point of discontinuity in
each period, apply integration by parts, with f(t) = u, to the integrals defining a_n and b_n.]


6.4 

Alternative forms of Fourier series 


The original form of the Fourier series of a function, as derived
in Sec. 6.2, can be converted into several other trigonometric
forms and into one in which imaginary exponentials appear
instead of real trigonometric functions. For instance, in the series

\[ f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left(a_n\cos\frac{n\pi t}{p} + b_n\sin\frac{n\pi t}{p}\right) \]

we can apply to each pair of terms of the same frequency the
usual procedure for reducing the sum of a sine and a cosine of the
same angle to a single term:

\[ f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\sqrt{a_n^2 + b_n^2}\left(\frac{a_n}{\sqrt{a_n^2 + b_n^2}}\cos\frac{n\pi t}{p} + \frac{b_n}{\sqrt{a_n^2 + b_n^2}}\sin\frac{n\pi t}{p}\right) \]

If we now define the angles γ_n and δ_n from the triangle shown in
Fig. 6.6 and set

\[ A_0 = \frac{a_0}{2} \qquad\text{and}\qquad A_n = \sqrt{a_n^2 + b_n^2} \]




FIGURE 6.6
The triangle defining the phase angles γ_n and δ_n for the
resultant of the terms of frequency nπ/p in a Fourier series.

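the last series can be written

\[
\begin{aligned}
f(t) &= A_0 + \sum_{n=1}^{\infty} A_n\left(\cos\frac{n\pi t}{p}\cos\gamma_n + \sin\frac{n\pi t}{p}\sin\gamma_n\right) \\
     &= A_0 + \sum_{n=1}^{\infty} A_n\cos\left(\frac{n\pi t}{p} - \gamma_n\right)
\end{aligned}
\]

or, equally well,

\[
\begin{aligned}
f(t) &= A_0 + \sum_{n=1}^{\infty} A_n\left(\cos\frac{n\pi t}{p}\sin\delta_n + \sin\frac{n\pi t}{p}\cos\delta_n\right) \\
     &= A_0 + \sum_{n=1}^{\infty} A_n\sin\left(\frac{n\pi t}{p} + \delta_n\right)
\end{aligned}
\]

In either of these forms, the quantity A_n = √(a_n² + b_n²) is
the resultant amplitude of the components of frequency nπ/p,
that is, the amplitude of the nth harmonic in the expansion. The
phase angles

\[ \gamma_n = \tan^{-1}\frac{b_n}{a_n} \qquad\text{and}\qquad \delta_n = \tan^{-1}\frac{a_n}{b_n} = \frac{\pi}{2} - \gamma_n \]

measure the lag or lead of the nth harmonic with reference to a
pure cosine or pure sine wave of the same frequency.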
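The conversion from a coefficient pair (a_n, b_n) to amplitude and phase is easy to sketch in code. The function name and sample values below are hypothetical, and the use of `math.atan2` in place of a plain inverse tangent is an assumption made here so that the angles fall in the correct quadrant when a_n or b_n is negative or zero.

```python
import math

def amplitude_phase(a_n, b_n):
    """Return (A_n, gamma_n, delta_n) for one harmonic.

    cos(gamma) = a_n / A_n, sin(gamma) = b_n / A_n
    sin(delta) = a_n / A_n, cos(delta) = b_n / A_n
    """
    A = math.hypot(a_n, b_n)
    gamma = math.atan2(b_n, a_n)
    delta = math.atan2(a_n, b_n)
    return A, gamma, delta

a_n, b_n = -3.0, 4.0            # hypothetical coefficients
A, gamma, delta = amplitude_phase(a_n, b_n)

x = 0.7                          # stands for n*pi*t/p at some instant
original = a_n * math.cos(x) + b_n * math.sin(x)
cosine_form = A * math.cos(x - gamma)
sine_form = A * math.sin(x + delta)
print(A, gamma, delta)
print(original, cosine_form, sine_form)
```

All three printed values on the last line agree, which is the content of the two alternative forms of the series.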

The complex exponential form of a Fourier series is obtained 
by substituting the exponential equivalents of the cosine and sine 
terms into the original form of the series : 



m = f + 



gnirl p _|_ g—nivlf) 
2 


+ 



Collecting terms on the various exponentials and noting that 
1/i — —i, we obtain 


m ■■ 




— ibn 


irtlp _|_ 


If we now define

$$c_0 = \frac{a_0}{2} \qquad c_n = \frac{a_n - ib_n}{2} \qquad c_{-n} = \frac{a_n + ib_n}{2}$$




the last series can be written in the more symmetric form

(1)  $$f(t) = \sum_{n=-\infty}^{\infty} c_n e^{n\pi i t/p}$$


Now, when it is used at all, this exponential form is used as a
basic form in its own right; i.e., it is not obtained by transformation
from the trigonometric form, but is constructed directly from
the given function. To do this requires that expressions be available
for the direct evaluation of the coefficients $c_n$. These may
easily be found from the definitions of $c_0$, $c_n$, and $c_{-n}$. For




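$$c_0 = \frac{a_0}{2} = \frac{1}{2p}\int_d^{d+2p} f(t)\,dt$$

$$c_n = \frac{a_n - ib_n}{2} = \frac{1}{2}\left[\frac{1}{p}\int_d^{d+2p} f(t)\cos\frac{n\pi t}{p}\,dt - i\,\frac{1}{p}\int_d^{d+2p} f(t)\sin\frac{n\pi t}{p}\,dt\right]$$
$$\phantom{c_n} = \frac{1}{2p}\int_d^{d+2p} f(t)\left(\cos\frac{n\pi t}{p} - i\sin\frac{n\pi t}{p}\right)dt = \frac{1}{2p}\int_d^{d+2p} f(t)\,e^{-n\pi i t/p}\,dt$$

$$c_{-n} = \frac{a_n + ib_n}{2} = \frac{1}{2}\left[\frac{1}{p}\int_d^{d+2p} f(t)\cos\frac{n\pi t}{p}\,dt + i\,\frac{1}{p}\int_d^{d+2p} f(t)\sin\frac{n\pi t}{p}\,dt\right]$$
$$\phantom{c_{-n}} = \frac{1}{2p}\int_d^{d+2p} f(t)\left(\cos\frac{n\pi t}{p} + i\sin\frac{n\pi t}{p}\right)dt = \frac{1}{2p}\int_d^{d+2p} f(t)\,e^{n\pi i t/p}\,dt$$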


Clearly, whether the index n is positive, negative, or zero, $c_n$ is
given by the one formula

(2)  $$c_n = \frac{1}{2p}\int_d^{d+2p} f(t)\,e^{-n\pi i t/p}\,dt$$

As usual, d will almost always be either —p or 0. 

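In the complex representation defined by (1) and (2), a
certain symmetry between the expressions for a function and for
its Fourier coefficients is evident. In fact the expressions

$$f(t) = \sum_{n=-\infty}^{\infty} c_n e^{n\pi i t/p} \qquad\qquad c_n = \frac{1}{2p}\int_d^{d+2p} f(t)\,e^{-n\pi i t/p}\,dt$$

are of essentially the same structure, as the following correlation
reveals:

$$t \sim n$$
$$f(t) \sim c_n \equiv c(n)$$
$$e^{n\pi i t/p} \sim e^{-n\pi i t/p}$$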




This duality is worthy of note, and, as our development proceeds 
to the Fourier integral and the Laplace transform, it will become 
still more striking and fundamental. 


EXAMPLE 1 

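Find the complex form of the Fourier series of the function whose definition in one period is
$f(t) = e^{-t}$, $-1 < t < 1$.

Since $p = 1$, we have from (2), taking $d = -1$,

$$c_n = \frac{1}{2}\int_{-1}^{1} e^{-t}e^{-n\pi i t}\,dt = \frac{1}{2}\left[\frac{e^{-(1+n\pi i)t}}{-(1+n\pi i)}\right]_{-1}^{1} = \frac{e^{-(1+n\pi i)} - e^{(1+n\pi i)}}{-2(1+n\pi i)} = \frac{e\cdot e^{n\pi i} - e^{-1}\cdot e^{-n\pi i}}{2(1+n\pi i)}$$

Now $e^{i\pi} = \cos\pi + i\sin\pi = -1$, and thus $e^{n\pi i} = e^{-n\pi i} = (-1)^n$. Hence

$$c_n = (-1)^n\,\frac{e - e^{-1}}{2}\,\frac{1}{1+n\pi i} = \frac{(-1)^n(1 - n\pi i)\sinh 1}{1 + n^2\pi^2}$$

The expansion of $f(t)$ is therefore

$$f(t) = \sum_{n=-\infty}^{\infty}\frac{(-1)^n(1 - n\pi i)\sinh 1}{1 + n^2\pi^2}\,e^{n\pi i t}$$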


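This, of course, can be converted into the real trigonometric form without difficulty, for
we have, by definition,

$$c_n = \frac{a_n - ib_n}{2} \qquad c_{-n} = \frac{a_n + ib_n}{2}$$

and thus, by adding and subtracting,

$$a_n = c_n + c_{-n} \qquad b_n = i(c_n - c_{-n})$$

Therefore in this problem

$$a_n = \frac{(-1)^n(1 - n\pi i)\sinh 1}{1 + n^2\pi^2} + \frac{(-1)^n(1 + n\pi i)\sinh 1}{1 + n^2\pi^2} = \frac{(-1)^n\,2\sinh 1}{1 + n^2\pi^2}$$

$$b_n = i\left[\frac{(-1)^n(1 - n\pi i)\sinh 1}{1 + n^2\pi^2} - \frac{(-1)^n(1 + n\pi i)\sinh 1}{1 + n^2\pi^2}\right] = \frac{(-1)^n\,2n\pi\sinh 1}{1 + n^2\pi^2}$$

$$\tfrac{1}{2}a_0 = c_0 = \sinh 1$$

Hence, we can also write

$$f(t) = \sinh 1 - 2\sinh 1\left(\frac{\cos\pi t}{1 + \pi^2} - \frac{\cos 2\pi t}{1 + 4\pi^2} + \frac{\cos 3\pi t}{1 + 9\pi^2} - \cdots\right) - 2\pi\sinh 1\left(\frac{\sin\pi t}{1 + \pi^2} - \frac{2\sin 2\pi t}{1 + 4\pi^2} + \frac{3\sin 3\pi t}{1 + 9\pi^2} - \cdots\right)$$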
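
As a sanity check on Example 1, the closed form for $c_n$ can be compared with a direct numerical evaluation of formula (2). This is only an illustrative sketch: the helper names `c_exact` and `c_numeric`, the midpoint rule, and the step count are assumptions made here, not part of the text.

```python
import cmath
import math

def c_exact(n):
    """Closed form of Example 1: (-1)^n (1 - n*pi*i) sinh(1) / (1 + n^2 pi^2)."""
    return (-1) ** n * (1 - 1j * n * math.pi) * math.sinh(1) / (1 + (n * math.pi) ** 2)

def c_numeric(n, steps=20000):
    """Midpoint-rule estimate of (1/2) * integral_{-1}^{1} e^{-t} e^{-n pi i t} dt."""
    h = 2.0 / steps
    total = 0j
    for k in range(steps):
        t = -1.0 + (k + 0.5) * h
        total += cmath.exp(-(1 + 1j * n * math.pi) * t)
    return 0.5 * h * total

# Largest discrepancy over a few positive, negative, and zero indices.
worst = max(abs(c_exact(n) - c_numeric(n)) for n in range(-3, 4))

# Real coefficients recovered by adding and subtracting c_n and c_{-n}.
a_2 = (c_exact(2) + c_exact(-2)).real
b_2 = (1j * (c_exact(2) - c_exact(-2))).real
```

With these definitions `a_2` and `b_2` should match $2\sinh 1/(1+4\pi^2)$ and $4\pi\sinh 1/(1+4\pi^2)$, the $n = 2$ values of the real coefficients found above.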




EXERCISES 

What is the amplitude of the resultant term of frequency $n\pi/p$ in the Fourier series of the
functions whose definitions in one period are the following? What is the phase of each of these terms
relative to $\cos(n\pi t/p)$? relative to $\sin(n\pi t/p)$?


1 fit) - 


0 


0 < t < 1 

1 <t <4 


s m 


1 0 < f < 1 

-1 1 <f <2 

0 2 < t < 4 


2 fit) 


t 

0 


4 fit) = 


1 

0 

-1 

0 


0 < t < 1 

1 < t < 2 

0 < t < 1 

1 < t < 2 

2 < / < 3 

3 < t < 4 


Find the complex form of the Fourier series of the periodic functions whose definitions in one 
period are: 


| 1 0 < f < 1 

= \ 0 1 < t < 2 

7 /«) - t 0 <i < 1 


6 fit) - 


6 fit) = j 
8 fit) - t 


9 fit) - cos « 
10 /(«) = sin i 


— jr/2 < f < ?r/2 (Hint : Use the fact that cos 5 « 

0 < f < ir 


0 < f < 1 

1 < t < 2 
-1 < t < 1 


6.5 

Applications

Although we shall see other uses of Fourier series in later chapters,
their most important application at the present stage of our work
is in the analysis of the behavior of physical systems subjected to
periodic disturbances.

EXAMPLE 1

If the root-mean-square, or rms, value of a function $f(t)$ over an interval $(a,b)$ is defined as

(1)  $$f(t)\big|_{\text{rms}} = \sqrt{\frac{1}{b-a}\int_a^b f^2(t)\,dt}$$

express the rms value of a periodic function over one period in terms of the coefficients in its
Fourier expansion.

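If $f(t)$ is of period $2p$, we can write

$$f(t) = \frac{a_0}{2} + a_1\cos\frac{\pi t}{p} + \cdots + a_n\cos\frac{n\pi t}{p} + \cdots + b_1\sin\frac{\pi t}{p} + \cdots + b_n\sin\frac{n\pi t}{p} + \cdots$$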

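Hence, $f^2(t)$ will consist exclusively of squared terms of the form

$$\frac{a_0^2}{4} \qquad a_n^2\cos^2\frac{n\pi t}{p} \qquad b_n^2\sin^2\frac{n\pi t}{p}$$

and cross-product terms of the form

$$a_0a_n\cos\frac{n\pi t}{p} \qquad a_0b_n\sin\frac{n\pi t}{p} \qquad 2a_ma_n\cos\frac{m\pi t}{p}\cos\frac{n\pi t}{p} \quad (m \ne n)$$

$$2a_mb_n\cos\frac{m\pi t}{p}\sin\frac{n\pi t}{p} \qquad 2b_mb_n\sin\frac{m\pi t}{p}\sin\frac{n\pi t}{p} \quad (m \ne n)$$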



As in the original derivation of the Euler formulas in Sec. 6.2, the integral of every cross-product
term, taken over one period of the function, is zero. Moreover, for the squared terms we have

$$\frac{a_0^2}{4}\int_{-p}^{p}dt = \frac{a_0^2\,p}{2} \qquad a_n^2\int_{-p}^{p}\cos^2\frac{n\pi t}{p}\,dt = a_n^2\,p \qquad b_n^2\int_{-p}^{p}\sin^2\frac{n\pi t}{p}\,dt = b_n^2\,p$$

Hence, dividing each of the nonzero terms by the length of the period, $2p$, we obtain for the
required rms value

(2)  $$f(t)\big|_{\text{rms}} = \sqrt{\frac{a_0^2}{4} + \frac{1}{2}\sum_{n=1}^{\infty}\left(a_n^2 + b_n^2\right)}$$

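Since the coefficients in the complex exponential form of the Fourier series of $f(t)$ are related to
the coefficients in the real trigonometric form by the equations

$$c_0 = \frac{a_0}{2} \qquad c_n = \frac{a_n - ib_n}{2} \qquad c_{-n} = \frac{a_n + ib_n}{2}$$

Eq. (2) can also be written

$$f(t)\big|_{\text{rms}} = \sqrt{\sum_{n=-\infty}^{\infty}|c_n|^2}$$

In particular, if $i = f(t)$ is an electric current flowing through a resistance $R$, the average
power dissipated is

$$\left(i\big|_{\text{rms}}\right)^2 R = \left[\frac{a_0^2}{4} + \frac{1}{2}\sum_{n=1}^{\infty}\left(a_n^2 + b_n^2\right)\right]R = \left(c_0^2 + 2\sum_{n=1}^{\infty}|c_n|^2\right)R$$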
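
The rms formula in terms of the complex coefficients can be tried on the function $f(t) = e^{-t}$, $-1 < t < 1$, of Example 1, Sec. 6.4, whose coefficients are $c_n = (-1)^n(1 - n\pi i)\sinh 1/(1 + n^2\pi^2)$. The sketch below is illustrative only; the truncation index `N` and the variable names are assumptions. The exact mean square is $\frac{1}{2}\int_{-1}^{1}e^{-2t}\,dt = \frac{1}{2}\sinh 2$.

```python
import math

def c(n):
    """Complex Fourier coefficient of e^{-t} on (-1, 1), from Example 1, Sec. 6.4."""
    return (-1) ** n * (1 - 1j * n * math.pi) * math.sinh(1) / (1 + (n * math.pi) ** 2)

N = 20000  # truncation index (assumption); the neglected tail is of order 1/N

# Sum of |c_n|^2 over n = -N, ..., N approximates the mean square of f.
mean_square_series = sum(abs(c(n)) ** 2 for n in range(-N, N + 1))

# Direct value: (1/2) * integral_{-1}^{1} e^{-2t} dt = sinh(2) / 2.
mean_square_exact = math.sinh(2) / 2

rms_series = math.sqrt(mean_square_series)
rms_exact = math.sqrt(mean_square_exact)
```

The two rms values should agree closely, the small remaining difference coming from the truncated tail of the series.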


EXAMPLE 2 

Determine the steady-state forced vibrations of the system shown in Fig. 6.7a if the applied
force is as shown in Fig. 6.7b.

Our first step must be to obtain the Fourier expansion of the driving force. Since $F(t)$ is
clearly an odd function of t, no cosine terms can be present, and thus we need only compute $b_n$:

$$b_n = \frac{2}{1/2}\int_0^{1/2} 20\sin 2n\pi t\,dt = \left[-\frac{80}{2n\pi}\cos 2n\pi t\right]_0^{1/2} = \frac{40}{n\pi}(1 - \cos n\pi) = \begin{cases} 0 & n \text{ even} \\[4pt] \dfrac{80}{n\pi} & n \text{ odd} \end{cases}$$

FIGURE 6.7 A spring-mass system acted upon by an alternating square-wave force.
(a) The system, labeled k = 100 lb/in., w = 96 lb, and a damping label "= 0.05". (b) The force F(t).




Hence

$$F(t) = \frac{80}{\pi}\left(\sin 2\pi t + \frac{\sin 6\pi t}{3} + \frac{\sin 10\pi t}{5} + \frac{\sin 14\pi t}{7} + \cdots\right)$$

Since we are concerned only with the steady-state forced motion of the system, we need determine
only the particular integral corresponding to $F(t)$. Since the equation is linear, this can
be done very simply using the ideas of Sec. 5.3, for it is necessary only to apply the proper
magnification ratio and phase shift to each component of the driving force and add the results.
Preparatory to this, we must determine the static deflections that would be produced in the
system by steady forces having the magnitudes of the various terms of $F(t)$. These are given by

$$(\delta_{st})_n = \frac{(80/n\pi)\ \text{lb}}{100\ \text{lb/in.}} = \frac{4}{5n\pi}\ \text{in.} \qquad n \text{ odd}$$

Then we must calculate the undamped natural frequency of the system:†

$$\omega_n = \sqrt{\frac{kg}{w}} = \sqrt{\frac{100 \times 384}{96}} = 20 \text{ rad/sec}$$


The rest of the work can best be presented in tabular form: 



Figure 6.8 shows the steady-state displacement plotted as a function of time. 

This example illustrates an exceedingly important but sometimes misunderstood charac- 
teristic of forced vibrations. If the driving force is not a pure sine or cosine function, its Fourier 
expansion will contain terms of frequencies above the fundamental, or apparent, frequency of 
the excitation. If the frequency of one of these terms happens to be close to the natural, or 
resonant, frequency of the system and if the amount of friction in the system is small, the corre- 
sponding magnification ratio will be large, and its value may offset many times the smaller 
amplitude of that harmonic and make the resultant term the dominant part of the entire 
response. If and when this happens, the response will appear to be of a higher frequency than 
the force which produces it. Figure 6.8 shows this clearly, for, although the force alternates 
only once per second, the weight is seen to move up and down three times per second. 
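
The dominance of the third harmonic described above can be reproduced with a few lines of arithmetic. The sketch below rests on two assumptions that are not stated in this passage: that the label "= 0.05" in Fig. 6.7 is the damping ratio $c/c_c$ (called `zeta` here), and that the magnification ratio of Sec. 5.3 has the standard form $M = 1/\sqrt{(1-r^2)^2 + (2\zeta r)^2}$ with $r = \omega/\omega_n$. The function name `harmonic_response` is hypothetical.

```python
import math

k = 100.0     # spring modulus, lb/in. (from Fig. 6.7)
w = 96.0      # weight, lb (from Fig. 6.7)
g = 384.0     # in./sec^2, as used in the text
zeta = 0.05   # ASSUMPTION: the figure label "= 0.05" read as the ratio c/c_c

omega_nat = math.sqrt(k * g / w)   # undamped natural frequency, rad/sec

def harmonic_response(n):
    """Static deflection, magnification ratio, and amplitude for odd harmonic n."""
    omega = 2 * math.pi * n                   # frequency of the term sin(2*n*pi*t)
    static = (80 / (n * math.pi)) / k         # (delta_st)_n = 4 / (5*n*pi) in.
    r = omega / omega_nat
    # ASSUMPTION: standard magnification ratio for a damped linear oscillator.
    M = 1 / math.sqrt((1 - r * r) ** 2 + (2 * zeta * r) ** 2)
    return static, M, static * M

amplitudes = {n: harmonic_response(n)[2] for n in (1, 3, 5, 7)}
dominant = max(amplitudes, key=amplitudes.get)
```

Under these assumptions the term of frequency $6\pi$ rad/sec lies closest to the natural frequency of 20 rad/sec, so its magnification more than offsets its smaller static deflection and `dominant` comes out as 3.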

It is interesting to note that, although the driving force in this example is discontinuous, 
both the displacement and the velocity it produces are continuous. This is suggested by the 


† We must remember that here the subscript n in $\omega_n$ stands for natural and
is in no way connected with the parameter n which identifies the general
term in the Fourier expansion of $F(t)$. In the next section this will not be the
case, but, taken in context, this dual use of the symbol $\omega_n$ should cause no
confusion.





FIGURE 6.8 Plot showing a response of apparent frequency greater than that of the
excitation producing it. (Vertical axis: displacement y.)


plot of the displacement shown in Fig. 6.8 and confirmed by an application of Theorem 3,
Sec. 6.3. In fact, since the frequency of the nth term in the Fourier expansion of the driving
force $F(t)$ is $(2n - 1)2\pi \approx 4n\pi$, it follows, neglecting all but the highest power of n, that, for n
sufficiently large, the magnification ratio M is arbitrarily close to

$$\frac{1}{\omega^2/\omega_n^2} = \frac{1}{(4n\pi)^2/(20)^2} = \frac{25}{n^2\pi^2}$$

Therefore, since the static deflection corresponding to the nth term in the expansion of $F(t)$ is

$$(\delta_{st})_n = \frac{4}{5(2n - 1)\pi} \approx \frac{2}{5n\pi}$$

it follows that as n becomes infinite the coefficient of the nth term in the expansion of the
steady-state displacement, namely, $(\delta_{st})_n M$, tends to zero as $10/\pi^3n^3$. Thus, according to Theorem 3,
Sec. 6.3, the displacement $y(t)$ and the velocity $\dot{y}(t)$ are continuous, but the acceleration $\ddot{y}(t)$ is
discontinuous.


EXAMPLE 3 

Find the steady-state current produced in the circuit shown in Fig. 6.9a by the voltage shown in 
Fig. 6.96. 

Our first step must be to find the Fourier expansion of the voltage. Since we plan to use 
the complex impedance, it will be convenient to use the complex exponential form of the Fourier 


FIGURE 6.9 A series circuit driven by a square-wave voltage.
(a) The circuit, with element values 250, 0.02, and 2 × 10⁻⁶. (b) The voltage E(t).





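series. Hence we compute

$$c_n = \frac{1}{0.01}\int_0^{0.005} E_0\,e^{-200n\pi i t}\,dt = \frac{E_0}{0.01}\left[\frac{e^{-200n\pi i t}}{-200n\pi i}\right]_0^{0.005} = \frac{iE_0}{2n\pi}\left(e^{-n\pi i} - 1\right) = \begin{cases} 0 & n \text{ even},\ n \ne 0 \\[4pt] -\dfrac{iE_0}{n\pi} & n \text{ odd} \end{cases}$$

$$c_0 = \frac{1}{0.01}\int_0^{0.005} E_0\,dt = \frac{E_0}{2}$$

Therefore,

$$E(t) = \frac{E_0}{2} + \frac{E_0}{\pi}\left(\cdots + \frac{ie^{-600\pi i t}}{3} + \frac{ie^{-200\pi i t}}{1} - \frac{ie^{200\pi i t}}{1} - \frac{ie^{600\pi i t}}{3} - \cdots\right)$$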

Now, in Sec. 5.4 we showed that the steady-state current produced by a voltage of the form
$Ae^{i\omega t}$ could be found simply by dividing the voltage by the complex impedance

$$Z(\omega) = R + i\left(\omega L - \frac{1}{\omega C}\right)$$

Using the data of the present problem, we have

$$Z(\omega) = 250 + i\left(0.02\,\omega - \frac{10^6}{2\omega}\right)$$

or, since $\omega = 200n\pi$, n odd, we have

$$Z(\omega) \equiv Z_n = 250 + i\left(4n\pi - \frac{2{,}500}{n\pi}\right) \qquad n \text{ odd}$$

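Hence, dividing each term in the expansion of the voltage $E(t)$ by the value of Z for the
corresponding frequency, we find

$$I(t) = \sum_{n=-\infty}^{\infty} D_n e^{200n\pi i t} \qquad n \text{ odd}†$$

where

$$D_n = \frac{c_n}{Z_n} = -\frac{iE_0}{n\pi}\,\frac{1}{250 + i(4n\pi - 2{,}500/n\pi)} = \frac{-iE_0}{250n\pi + i(4\pi^2n^2 - 2{,}500)}$$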

If we desire the real trigonometric form of this expansion, namely,

$$I(t) = \tfrac{1}{2}a_0 + a_1\cos 200\pi t + a_3\cos 600\pi t + \cdots + b_1\sin 200\pi t + b_3\sin 600\pi t + \cdots$$

we have at once

$$a_n = D_n + D_{-n} = -iE_0\left[\frac{1}{250n\pi + i(4\pi^2n^2 - 2{,}500)} - \frac{1}{250n\pi - i(4\pi^2n^2 - 2{,}500)}\right] = -\frac{2E_0(4\pi^2n^2 - 2{,}500)}{(250n\pi)^2 + (4\pi^2n^2 - 2{,}500)^2} \qquad n \text{ odd}$$

$$b_n = i(D_n - D_{-n}) = E_0\left[\frac{1}{250n\pi + i(4\pi^2n^2 - 2{,}500)} + \frac{1}{250n\pi - i(4\pi^2n^2 - 2{,}500)}\right] = \frac{500n\pi E_0}{(250n\pi)^2 + (4\pi^2n^2 - 2{,}500)^2} \qquad n \text{ odd}$$


† Because of the presence of the condenser, the impedance for the DC component,
or component of zero frequency, is infinite. Hence the term $\tfrac{1}{2}E_0$ in
the expansion of $E(t)$ makes no contribution to the steady-state current.
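
The passage from the complex coefficients $D_n$ to the real coefficients $a_n = D_n + D_{-n}$ and $b_n = i(D_n - D_{-n})$ can be verified numerically. The sketch below is an illustration under stated assumptions: the element values 250, 0.02, and 2 × 10⁻⁶ are taken as R, L, and C in consistent units, `E0 = 1.0` is an arbitrary amplitude, and the function names are hypothetical.

```python
import math

R, L, C = 250.0, 0.02, 2e-6   # element values from Fig. 6.9 (units assumed consistent)
E0 = 1.0                      # ASSUMPTION: arbitrary voltage amplitude for the check

def Z(n):
    """Complex impedance at the frequency 200*n*pi of the nth term."""
    omega = 200 * n * math.pi
    return complex(R, omega * L - 1 / (omega * C))

def c(n):
    """Coefficient of the square-wave voltage for odd n: -i*E0/(n*pi)."""
    return -1j * E0 / (n * math.pi)

def D(n):
    """Coefficient of the current: voltage coefficient divided by impedance."""
    return c(n) / Z(n)

def real_coefficients(n):
    """Closed forms for a_n and b_n obtained by combining D_n and D_{-n}."""
    A = 250 * n * math.pi
    B = 4 * math.pi ** 2 * n ** 2 - 2500
    den = A * A + B * B
    return -2 * E0 * B / den, 2 * E0 * A / den

worst = 0.0
for n in (1, 3, 5, 7):
    a_direct = D(n) + D(-n)
    b_direct = 1j * (D(n) - D(-n))
    a_formula, b_formula = real_coefficients(n)
    worst = max(worst, abs(a_direct - a_formula), abs(b_direct - b_formula))
```

Since `Z(-n)` is the complex conjugate of `Z(n)`, both combinations come out real, as they must for a real current.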




EXERCISES 

1 In Example 2, discuss the problem of determining the complete motion, transient as well 
as steady-state. 

2 In Example 3, why is the current I(t) continuous when the impressed voltage E(t) is
discontinuous? Is the charge Q(t) continuous?

3 In Example 2, determine the steady-state motion if the amount of friction is doubled and 
the spring is changed to one of modulus 120 lb/in. 

4 Determine the steady-state motion of the system shown in Fig. 6.10. 



FIGURE 6.11 







7 Determine the steady-state current in the circuit shown in Fig. 6.13.


FIGURE 6.13 


600 4xl0" 6 1 



8 If $f(t) = A_1\sin(\omega t - \delta_1) + A_2\sin(2\omega t - \delta_2) + A_3\sin(3\omega t - \delta_3) + \cdots$, show that



9 If $f_1(t) = \sum_{n=-\infty}^{\infty} c_n e^{n\pi i t/p}$ and $f_2(t) = \sum_{n=-\infty}^{\infty} d_n e^{n\pi i t/p}$ are two functions of period $2p$, show
that the average value of the product $f_1(t)f_2(t)$ over one period is $\sum_{n=-\infty}^{\infty} c_n d_{-n}$.

10 If $f(t)$ is the periodic function whose definition in one period is

$$f(t) = |t| \qquad -\pi < t < \pi$$

find the solution of each of the following equations satisfying the indicated conditions:

a $y'' - y = f(t)$, $y_0 = y_0' = 0$        b $y'' + 3y' + 2y = f(t)$, $y_0 = y_0' = 0$

c $y'' + y = f(t)$, $y_0 = y_0' = 0$        d $y'' + 9y = f(t)$, $y_0 = y_0' = 0$


6.6

Harmonic analysis

From time to time in applied work it is necessary to construct the 
Fourier expansion of a function defined by a table of values 
instead of by an analytic expression. Various methods have been 
devised for doing this, comprising collectively the field of har- 
monic analysis.* Of these, the simplest and most obvious is the 
evaluation of the integrals in the Euler formulas by means of the 
trapezoidal rule. Since this method is quite satisfactory for the 
occasional applications confronting the average worker and need 
be improved upon only for the person who must handle numer- 
ical functions regularly, it is the only method we shall discuss. 

Because so many problems in harmonic analysis involve 
functions, such as meteorological or economic quantities, whose 
period is either a day or a year, it is customary to assume that 


* For a more detailed discussion of harmonic analysis see, for instance, E. T. 
Whittaker and G. Robinson, “The Calculus of Observations,” pp. 260-283, 
Blackie & Son, Ltd., Glasgow, 1937. 





data are available at intervals of 1/12, or sometimes 1/24, of a
period. Accordingly, we shall consider a function $f(t)$ of period $2p$
for which values are available at intervals of

(1)  $$\Delta t = \frac{2p}{24} = \frac{p}{12}$$

Now any function $f(t)$ can be expressed as the sum of an even
function and an odd function simply by writing

(2)  $$f(t) = \frac{f(t) + f(-t)}{2} + \frac{f(t) - f(-t)}{2} = g(t) + h(t), \text{ say,}$$


since $g(t)$ is clearly even and $h(t)$ is clearly odd. Hence the cosine
terms in the expansion of $f(t)$ are just the terms in the half-range
cosine expansion of $g(t)$, and the sine terms in the expansion of
$f(t)$ are just the terms in the half-range sine expansion of $h(t)$.
For the even function $g(t)$ we have, as usual,

$$a_n = \frac{2}{p}\int_0^p g(t)\cos\frac{n\pi t}{p}\,dt$$

or, applying the trapezoidal rule, with $\Delta t = p/12$,

$$a_n = \frac{2}{p}\left[\frac{g_0}{2}\cos\left(\frac{n\pi}{p}\cdot 0\right) + g_1\cos\left(\frac{n\pi}{p}\cdot\frac{p}{12}\right) + \cdots + g_{11}\cos\left(\frac{n\pi}{p}\cdot\frac{11p}{12}\right) + \frac{g_{12}}{2}\cos\left(\frac{n\pi}{p}\cdot\frac{12p}{12}\right)\right]\frac{p}{12}$$

$$\phantom{a_n} = \frac{1}{6}\left[\frac{g_0}{2} + g_1\cos\frac{n\pi}{12} + \cdots + g_{11}\cos\frac{11n\pi}{12} + \frac{g_{12}}{2}\cos n\pi\right]$$

The cosine factors in the last expression can be evaluated once
and for all and combined with the other numerical factors, including
the 1/2 in the definition of $g(t)$, to yield a set of weights by
which the successive values of the sum $f(t) + f(-t)$ are to be
multiplied before they are added. The weights involved in the
calculation of the first ten a's are given in Table 6.1.

Similarly, for $h(t)$ we have

$$b_n = \frac{2}{p}\int_0^p h(t)\sin\frac{n\pi t}{p}\,dt$$

$$\phantom{b_n} = \frac{2}{p}\left[\frac{h_0}{2}\sin\left(\frac{n\pi}{p}\cdot 0\right) + h_1\sin\left(\frac{n\pi}{p}\cdot\frac{p}{12}\right) + \cdots + h_{11}\sin\left(\frac{n\pi}{p}\cdot\frac{11p}{12}\right) + \frac{h_{12}}{2}\sin\left(\frac{n\pi}{p}\cdot\frac{12p}{12}\right)\right]\frac{p}{12}$$

$$\phantom{b_n} = \frac{1}{6}\left[h_1\sin\frac{n\pi}{12} + \cdots + h_{11}\sin\frac{11n\pi}{12}\right]$$

The weights required for the evaluation of this expression are 
shown in Table 6.2. 






EXAMPLE 1 

Find $a_1$ for the periodic function whose definition in one period is

y_0 = -2.0000    y_5 = -0.9236    y_10 = -0.1944    y_15 = 0.1875    y_20 = 0.2222
y_1 = -1.7569    y_6 = -0.7500    y_11 = -0.0903    y_16 = 0.2222    y_21 = 0.1875
y_2 = -1.5278    y_7 = -0.5903    y_12 =  0.0000    y_17 = 0.2431    y_22 = 0.1389
y_3 = -1.3125    y_8 = -0.4444    y_13 =  0.0764    y_18 = 0.2500    y_23 = 0.0764
y_4 = -1.1111    y_9 = -0.3125    y_14 =  0.1389    y_19 = 0.2431    y_24 = 0.0000


Interpreting the mid-ordinate $y_{12}$ to be $f_0$ in our general discussion and using the weights
in the column headed n = 1 in Table 6.1, we find without difficulty

Term                            Weight      Product
0.0000          =  0.0000       0.08333     0.00000
0.0764 - 0.0903 = -0.0139       0.08049    -0.00112
0.1389 - 0.1944 = -0.0555       0.07217    -0.00401
0.1875 - 0.3125 = -0.1250       0.05893    -0.00737
0.2222 - 0.4444 = -0.2222       0.04167    -0.00926
0.2431 - 0.5903 = -0.3472       0.02157    -0.00749
0.2500 - 0.7500 = -0.5000       0.00000     0.00000
0.2431 - 0.9236 = -0.6805      -0.02157     0.01469
0.2222 - 1.1111 = -0.8889      -0.04167     0.03704
0.1875 - 1.3125 = -1.1250      -0.05893     0.06630
0.1389 - 1.5278 = -1.3889      -0.07217     0.10024
0.0764 - 1.7569 = -1.6805      -0.08049     0.13526
0.0000 - 2.0000 = -2.0000      -0.04167     0.08334

                                    a_1 =   0.40762

In this simple illustration, y is actually the function $t - t^2$, $-1 \le t \le 1$, whose expansion
we obtained in Sec. 6.3. The exact value of $a_1$, as read from Eq. (4), Sec. 6.3, is $4/\pi^2 = 0.4053$;
so our approximation is in error by about 1/2 of 1 per cent.
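
The trapezoidal-rule procedure of this section is easy to mechanize. The sketch below is an illustration, not the tabular method of Tables 6.1 and 6.2: the function names `trapezoid_a` and `trapezoid_b` and their parameters are assumptions, and the cosine and sine factors are computed directly instead of being read from a table of weights. It is applied to $f(t) = t - t^2$ on $(-1, 1)$, the function of Example 1.

```python
import math

def trapezoid_a(f, n, p=1.0, m=12):
    """Approximate a_n from the even part g(t) = [f(t) + f(-t)]/2 with m intervals on (0, p)."""
    g = [(f(k * p / m) + f(-k * p / m)) / 2 for k in range(m + 1)]
    total = g[0] / 2 + (g[m] / 2) * math.cos(n * math.pi)
    total += sum(g[k] * math.cos(n * k * math.pi / m) for k in range(1, m))
    return (2.0 / m) * total          # (2/p) * (p/m); equals 1/6 when m = 12

def trapezoid_b(f, n, p=1.0, m=12):
    """Approximate b_n from the odd part h(t) = [f(t) - f(-t)]/2; the end terms vanish."""
    h = [(f(k * p / m) - f(-k * p / m)) / 2 for k in range(m + 1)]
    return (2.0 / m) * sum(h[k] * math.sin(n * k * math.pi / m) for k in range(1, m))

def f(t):
    return t - t * t                  # the function tabulated in Example 1

a1 = trapezoid_a(f, 1)
b1 = trapezoid_b(f, 1)
relative_error_a1 = abs(a1 - 4 / math.pi ** 2) / (4 / math.pi ** 2)
```

Here `a1` should reproduce the tabular result of Example 1 to within the rounding of the tabulated ordinates, and `b1` approximates the exact value $2/\pi$ of the first sine coefficient of the odd part $t$.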


EXERCISES 

1 Determine by harmonic analysis the Fourier expansion of the circular arc $y = \sqrt{2\pi x - x^2}$
($0 \le x \le 2\pi$) through $a_6$ and $b_6$.

2 The following table gives the cylinder pressure in pounds per square inch for a certain
4-cycle internal combustion engine at 30° increments of the crank angle θ:

  θ     P  |    θ     P  |    θ     P  |    θ     P  |    θ     P
  0   200  |  150    45  |  300     0  |  450     0  |  600     5
 30   350  |  180    20  |  330     0  |  480     0  |  630    11
 60   167  |  210     6  |  360     0  |  510     0  |  660    30
 90   102  |  240     0  |  390     0  |  540     0  |  690    90
120    65  |  270     0  |  420     0  |  570     0  |  720   200

Determine the harmonic analysis of the pressure through $a_6$ and $b_6$.

3 The normal maximum and minimum temperatures at New York City on the first and 
fifteenth of each month are given in the following table: 


           Max.  Min. |             Max.  Min. |             Max.  Min.
Jan.   1    38    26  |  May    1    63    48  |  Sept.  1    77    64
      15    37    24  |        15    68    52  |        15    74    60
Feb.   1    37    24  |  June   1    73    57  |  Oct.   1    69    55
      15    38    24  |        15    77    60  |        15    64    49
Mar.   1    41    26  |  July   1    80    64  |  Nov.   1    57    43
      15    45    30  |        15    82    66  |        15    51    37
Apr.   1    51    36  |  Aug.   1    82    67  |  Dec.   1    45    32
      15    57    42  |        15    80    67  |        15    41    29

Neglecting the slight irregularities in the spacing of the data, determine the harmonic
analysis of the maximum temperature and of the minimum temperature.

4 By evaluating the complex form of the Fourier series of $f(t)$ at the points $t = 0, \pi/m, 2\pi/m,$
$\ldots, (2m - 1)\pi/m$ and using the fact that $e^{i\pi} = -1$, show that

$$f(0) - f\left(\frac{\pi}{m}\right) + f\left(\frac{2\pi}{m}\right) - \cdots - f\left(\frac{(2m-1)\pi}{m}\right)$$
$$= 2m(\cdots + c_{-5m} + c_{-3m} + c_{-m} + c_m + c_{3m} + c_{5m} + \cdots)$$
$$= 2m(a_m + a_{3m} + a_{5m} + \cdots)$$

Explain how this formula can be used to determine approximately the coefficients of the
cosine terms in the Fourier expansion of $f(t)$.

5 By evaluating the complex form of the Fourier series of $f(t)$ at the points $t = \pi/2m, 3\pi/2m,$
$5\pi/2m, \ldots, (4m - 1)\pi/2m$ and using the fact that $e^{i\pi/2} = i$, show that

$$f\left(\frac{\pi}{2m}\right) - f\left(\frac{3\pi}{2m}\right) + f\left(\frac{5\pi}{2m}\right) - \cdots - f\left(\frac{(4m-1)\pi}{2m}\right)$$
$$= 2mi(\cdots - c_{-5m} + c_{-3m} - c_{-m} + c_m - c_{3m} + c_{5m} - \cdots)$$
$$= 2m(b_m - b_{3m} + b_{5m} - \cdots)$$

Explain how this formula can be used to determine approximately the coefficients of the
sine terms in the Fourier expansion of $f(t)$.


6.7 

The Fourier integral as the limit of a Fourier series

The properties of Fourier series we have thus far developed are 
adequate to accomplish the expansion of any periodic function 
satisfying the Dirichlet conditions and, in connection with the 
theory of Chap. 5, enable us to find the response of numerous 
mechanical and electrical systems to general periodic disturb- 
ances. On the other hand, in many problems the impressed force 
or voltage is nonperiodic rather than periodic, a single unrepeated 
pulse, for instance. Functions of this sort cannot be handled 
directly through the use of Fourier series, for such series neces- 
sarily define only periodic functions. However, by investigating 
the limit (if any) which is approached by a Fourier series as the 
period of the given function becomes infinite, a suitable repre- 
sentation for nonperiodic functions can perhaps be obtained. An 




FIGURE 6.14 A periodic function of period 2p.


example is probably the best way to introduce the theory of this 
procedure. 

Consider, then, the function $f_p(t)$ shown in Fig. 6.14 in the
limit as $p \to \infty$, as suggested by Fig. 6.15. This function is clearly
even, and thus its Fourier expansion contains only cosine terms;
i.e.,

(1)  $$f_p(t) = \frac{a_0}{2} + a_1\cos\frac{\pi t}{p} + a_2\cos\frac{2\pi t}{p} + \cdots + a_n\cos\frac{n\pi t}{p} + \cdots$$

where

(2a)  $$a_0 = \frac{2}{p}\int_0^1 dt = \frac{2}{p}$$

and

(2b)  $$a_n = \frac{2}{p}\int_0^1\cos\frac{n\pi t}{p}\,dt = \frac{2}{p}\left[\frac{\sin(n\pi t/p)}{n\pi/p}\right]_0^1 = \frac{2}{p}\,\frac{\sin(n\pi/p)}{n\pi/p}$$

It will help us now to understand what happens as $p \to \infty$ if
we plot the “spectrum” of $f_p(t)$; that is, plot $a_n$ as a function of the
frequency

$$\omega_n = \frac{n\pi}{p} \text{ rad/unit time}$$

for different values of p. Introducing the symbol $\omega_n$ in Eq. (2b),
we then have

(3)  $$a_n = \frac{2}{p}\,\frac{\sin\omega_n}{\omega_n}$$

where successive values of n correspond to values of $\omega_n$ which
differ by the constant amount

$$\Delta\omega = \frac{(n+1)\pi}{p} - \frac{n\pi}{p} = \frac{\pi}{p}$$

Hence the values of $a_n$ are simply the ordinates of the curve

(4)  $$a = \frac{2}{p}\,\frac{\sin\omega}{\omega} = \frac{2\,\Delta\omega}{\pi}\,\frac{\sin\omega}{\omega}$$

at successive ω-intervals of $\pi/p$, beginning at $\omega = 0$. These are
shown in Fig. 6.16 for p = 2, 4, and 8.
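
The statement that the coefficients are ordinates of a single curve sampled at intervals of $\pi/p$ can be checked by computing each $a_n$ directly from its defining integral. The following sketch is illustrative only; the names `a_coeff` and `envelope`, the midpoint rule, and the step count are assumptions.

```python
import math

def a_coeff(n, p, steps=4000):
    """Midpoint-rule estimate of a_n = (2/p) * integral_0^1 cos(n*pi*t/p) dt."""
    h = 1.0 / steps
    total = sum(math.cos(n * math.pi * (k + 0.5) * h / p) for k in range(steps))
    return (2.0 / p) * h * total

def envelope(omega, p):
    """The curve a = (2/p) sin(omega)/omega, with its limiting value 2/p at omega = 0."""
    return 2.0 / p if omega == 0 else (2.0 / p) * math.sin(omega) / omega

worst = 0.0
spectra = {}
for p in (2, 4, 8):
    spectra[p] = []
    for n in range(0, 17):
        omega_n = n * math.pi / p          # frequencies spaced pi/p apart
        a_n = a_coeff(n, p)
        spectra[p].append((omega_n, a_n))
        worst = max(worst, abs(a_n - envelope(omega_n, p)))
```

Doubling `p` halves both the spacing of the sampled frequencies and the height of every ordinate, which is the behavior pictured in Fig. 6.16.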

As suggested by Fig. 6.16 and confirmed by Eq. (4), the 


FIGURE 6.16 Plot illustrating the behavior of the Fourier coefficients of a function as the
period of the function becomes infinite. (Ordinates labeled n = 1, 2, ..., 8, spaced Δω = π/p apart.)








coefficient plots, or spectra, for different values of p differ in only
two respects:

a In the vertical scale, which is inversely proportional to p (or
directly proportional to $\Delta\omega$)

b In the horizontal interval between ordinates, which is also
inversely proportional to p (or equal to $\Delta\omega$)

The fact that, as $p \to \infty$ (or $\Delta\omega \to 0$), the frequencies of the terms
in (1) become more and more closely spaced and the coefficients
approach zero suggests that this series, thought of as a function
of p, is actually a sum of infinitesimals whose limit is an integral.
Indeed, this is true of Fourier series in general, as the following
outline of steps reveals.

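If we begin with the complex exponential form of a Fourier
series [Eqs. (1) and (2), Sec. 6.4]

$$f_p(t) = \sum_{n=-\infty}^{\infty} c_n e^{n\pi i t/p}$$

$$c_n = \frac{1}{2p}\int_{-p}^{p} f_p(t)\,e^{-n\pi i t/p}\,dt \equiv \frac{1}{2p}\int_{-p}^{p} f_p(s)\,e^{-n\pi i s/p}\,ds$$

and substitute the second expression for $c_n$ into $f_p(t)$, we obtain

$$f_p(t) = \sum_{n=-\infty}^{\infty}\left[\frac{1}{2p}\int_{-p}^{p} f_p(s)\,e^{-n\pi i s/p}\,ds\right]e^{n\pi i t/p}$$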

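Now, as above, let us denote the frequency of the general term by

$$\omega_n = \frac{n\pi}{p}$$

and the difference in frequency between successive terms by

$$\Delta\omega = \omega_{n+1} - \omega_n = \frac{\pi}{p}$$

Then $1/2p = \Delta\omega/2\pi$, and if we set

(6)  $$F(\omega) = \frac{1}{2\pi}\left[\int_{-p}^{p} f_p(s)\,e^{-i\omega s}\,ds\right]e^{i\omega t}$$

the last series takes the form

(7)  $$f_p(t) = \sum_{n=-\infty}^{\infty} F(\omega_n)\,\Delta\omega$$

Now the limit of a sum of the form (7) as $\Delta\omega \to 0$ is the integral

$$\int_{-\infty}^{\infty} F(\omega)\,d\omega$$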

Hence, since $p \to \infty$ implies $\Delta\omega \to 0$, it follows that there is good
reason to believe that as $p \to \infty$ the nonperiodic limit of $f_p(t)$ can
be written as the integral

(8)  $$f(t) = \int_{-\infty}^{\infty} e^{i\omega t}\left[\frac{1}{2\pi}\int_{-\infty}^{\infty} f(s)\,e^{-i\omega s}\,ds\right]d\omega$$

Though our derivation of it has been far from complete,* the
last result is actually a valid representation of the nonperiodic
limit function $f(t)$, provided that, in every finite interval, $f(t)$
satisfies the Dirichlet conditions and that the improper integral

$$\int_{-\infty}^{\infty}|f(t)|\,dt$$

exists. Under these conditions, the so-called Fourier integral (8)
gives the value of $f(t)$ at all points where $f(t)$ is continuous and
gives the average of the right- and left-hand limits of $f(t)$ at all
points where $f(t)$ is discontinuous.

The Fourier integral can be written in various forms. For
instance, we can write Eq. (8) as

(9)  $$f(t) = \int_{-\infty}^{\infty} g(\omega)\,e^{i\omega t}\,d\omega$$

where

(10)  $$g(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(s)\,e^{-i\omega s}\,ds$$

These two expressions, in which the symmetry between $f(t)$ and
its coefficient function $g(\omega)$ is unmistakable, constitute what is
known as a Fourier transform pair.† The coefficient function $g(\omega)$
is, of course, completely equivalent to $f(t)$, since when it is known,
$f(t)$ is determined through Eq. (9). In effect, we thus have two
different representations of the function of our discussion: $f(t)$ in
the time domain and $g(\omega)$ in the frequency domain. In passing,
we note that elaborate tables of Fourier transform pairs have
been prepared for engineering use.‡


* The situation is actually not so simple as we have made it appear, for from
(6) it is clear that the structure of the function $F(\omega)$ depends on p as well as
upon ω. Hence, as p increases, the function we are evaluating changes, and
the elementary theory of the definite integral is not strictly applicable.
Moreover, the fact that the summation extends over an infinite range makes
additional investigation of the limiting process necessary. The modifications
required for a rigorous justification of our conclusions can be found in more
advanced texts, such as R. V. Churchill, “Fourier Series and Boundary
Value Problems,” 2d ed., pp. 113-117, McGraw-Hill Book Company, New
York, 1963.

† Sometimes it is more convenient to associate the factor $1/2\pi$ with the integral
for $f(t)$ instead of with the integral for $g(\omega)$. It is also possible to achieve
a still more symmetric form by associating the factor $1/\sqrt{2\pi}$ with each of
the integrals.

‡ G. A. Campbell and R. M. Foster, “Fourier Integrals for Practical Applications,”
D. Van Nostrand Company, Inc., Princeton, N.J., 1948.




If we choose, we can, of course, move the factor $e^{i\omega t}$ into the
integrand of the inner integral in (8), since it does not involve the
variable s of that integration. This gives

(11)  $$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(s)\,e^{-i\omega(s-t)}\,ds\,d\omega$$

In this, we can replace the exponential by its trigonometric equivalent,
getting

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(s)[\cos\omega(s - t) - i\sin\omega(s - t)]\,ds\,d\omega$$

If we break this up into two integrals, we get

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(s)\cos\omega(s - t)\,ds\,d\omega - \frac{i}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(s)\sin\omega(s - t)\,ds\,d\omega$$

Now $\sin\omega(s - t)$ is an odd function of ω. Hence the second
integral vanishes because of the ω-integration from $-\infty$ to $\infty$.
This could have been foreseen, of course, since by hypothesis
$f(t)$ is purely real. Thus we obtain the real trigonometric
representation

(12a)  $$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(s)\cos\omega(s - t)\,ds\,d\omega$$

Since the integrand of (12a) is an even function of ω, we need
perform the ω-integration only between 0 and $\infty$, provided we
multiply the result by 2. This gives us the modified form

(12b)  $$f(t) = \frac{1}{\pi}\int_0^{\infty}\int_{-\infty}^{\infty} f(s)\cos\omega(s - t)\,ds\,d\omega$$


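If $f(t)$ is either an odd function or an even function, further
simplifications are possible. To see this, we first expand the factor
$\cos\omega(s - t)$ in the integrand of (12a), getting

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(s)\cos\omega s\cos\omega t\,ds\,d\omega + \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(s)\sin\omega s\sin\omega t\,ds\,d\omega$$

and then write the inner integrals as the sums of integrals over
$(-\infty,0)$ and $(0,\infty)$. Then

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{0} f(s)\cos\omega s\cos\omega t\,ds\,d\omega + \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{0}^{\infty} f(s)\cos\omega s\cos\omega t\,ds\,d\omega$$
$$\phantom{f(t) =} + \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{0} f(s)\sin\omega s\sin\omega t\,ds\,d\omega + \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{0}^{\infty} f(s)\sin\omega s\sin\omega t\,ds\,d\omega$$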




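Next we make the substitution $s = -z$, $ds = -dz$ in the integrals
from $-\infty$ to 0:

(13)  $$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{\infty}^{0} f(-z)\cos(-\omega z)\cos\omega t\,(-dz)\,d\omega + \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{0}^{\infty} f(s)\cos\omega s\cos\omega t\,ds\,d\omega$$
$$\phantom{f(t) =} + \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{\infty}^{0} f(-z)\sin(-\omega z)\sin\omega t\,(-dz)\,d\omega + \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{0}^{\infty} f(s)\sin\omega s\sin\omega t\,ds\,d\omega$$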

Now, if $f(t)$ is an even function, so that $f(-z) = f(z)$, the
first integral in (13) becomes identical with the second when the
minus sign attached to dz is used to reverse the order of the limits
on the inner integral. Similarly, the third and fourth integrals
turn out to be negatives of each other. Hence we have simply

(14)  $$f(t) = \frac{1}{\pi}\int_{-\infty}^{\infty}\int_{0}^{\infty} f(s)\cos\omega s\cos\omega t\,ds\,d\omega \qquad f(t) \text{ even}$$

This is called the Fourier cosine integral and is analogous to the
half-range cosine expansion of a periodic function which is even.

If $f(t)$ is an odd function, so that $f(-z) = -f(z)$, then the
first and second integrals in (13) cancel each other, and the third
and fourth combine, giving

(15)  $$f(t) = \frac{1}{\pi}\int_{-\infty}^{\infty}\int_{0}^{\infty} f(s)\sin\omega s\sin\omega t\,ds\,d\omega \qquad f(t) \text{ odd}$$

This is the Fourier sine integral and is the analogue of the
half-range sine expansion of an odd periodic function.

For some purposes it is convenient to have the Fourier
cosine and sine integral representations displayed as transform
pairs. Thus we can write (14) in the form

(14a)  $$f(t) = \int_{-\infty}^{\infty} g(\omega)\cos\omega t\,d\omega \qquad g(\omega) = \frac{1}{\pi}\int_0^{\infty} f(s)\cos\omega s\,ds \qquad f(t) \text{ even}$$

and (15) in the form

(15a)  $$f(t) = \int_{-\infty}^{\infty} g(\omega)\sin\omega t\,d\omega \qquad g(\omega) = \frac{1}{\pi}\int_0^{\infty} f(s)\sin\omega s\,ds \qquad f(t) \text{ odd}$$

Equations (14), (14a), (15), and (15a) can, of course, all be modified
by performing the ω-integrations only from 0 to $\infty$ and multiplying
the results by 2.

To illustrate the Fourier integral representation of a non- 
periodic function, let us return to the isolated pulse which we 
considered briefly at the beginning of this section (Fig. 6.15). 




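Since this function is clearly even, we can use (14), getting

(16)  $$f(t) = \frac{1}{\pi}\int_{-\infty}^{\infty}\int_0^1 1\cdot\cos\omega s\cos\omega t\,ds\,d\omega = \frac{1}{\pi}\int_{-\infty}^{\infty}\cos\omega t\left[\frac{\sin\omega s}{\omega}\right]_0^1 d\omega$$
$$\phantom{f(t)} = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\cos\omega t\sin\omega}{\omega}\,d\omega = \frac{2}{\pi}\int_0^{\infty}\frac{\cos\omega t\sin\omega}{\omega}\,d\omega$$

where the last step follows because the integrand is an even function
of ω. Thus, although it is impossible to find an elementary
antiderivative for the last integral, we know that, as a definite
integral, it must equal 1 if t is between -1 and +1, must equal
1/2 if $t = \pm 1$, and must vanish if t is numerically greater than 1.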

In the case of the Fourier-series representation of a periodic
function it was a matter of some interest to determine how well
the first few terms of the expansion represented the function
(Fig. 6.3). The corresponding problem in the nonperiodic case is
to investigate how well the Fourier integral represents a function
when only the components in the lower part of the (continuous)
frequency range are taken into account. Suppose, therefore, that
we consider only the frequencies below $\omega_0$. In this case, from (16)
we have, as an approximation to $f(t)$, the finite integral

$$\frac{2}{\pi}\int_0^{\omega_0}\frac{\cos\omega t\sin\omega}{\omega}\,d\omega$$

Now

$$\cos a\sin b = \frac{\sin(a + b) - \sin(a - b)}{2}$$

and thus we can write the last integral as

$$\frac{1}{\pi}\int_0^{\omega_0}\frac{\sin\omega(t + 1)}{\omega}\,d\omega - \frac{1}{\pi}\int_0^{\omega_0}\frac{\sin\omega(t - 1)}{\omega}\,d\omega$$

In the first of these terms let $\omega(t + 1) = u$, and in the second let
$\omega(t - 1) = u$. Then, for our approximation to $f(t)$, we have

$$\frac{1}{\pi}\int_0^{\omega_0(t+1)}\frac{\sin u}{u}\,du - \frac{1}{\pi}\int_0^{\omega_0(t-1)}\frac{\sin u}{u}\,du$$

Although integrals of this form cannot be expressed in terms
of elementary functions, they occur often enough in applied
mathematics to have been named and tabulated. Specifically,

$$\text{Si}(x) \equiv \int_0^x\frac{\sin u}{u}\,du$$

is known as the sine integral function of x and is tabulated in
numerous handbooks.* Using this notation, the approximation to
$f(t)$ can be written

$$\frac{1}{\pi}\,\text{Si}\,\omega_0(t + 1) - \frac{1}{\pi}\,\text{Si}\,\omega_0(t - 1)$$

* See, for instance, E. Jahnke, F. Emde, and F. Losch, "Tables of Higher 
Functions,” 6th ed., McGraw-Hill Book Company, New York, 1960. 





FIGURE 6.17 
Plot showing the 
approximation of 
a function by its 
Fourier integral 
taken only over 
frequencies less 
than too. 



^ji 

Figure 6.17 shows this approximation for $\omega_0 = 4$, 8, and 16 rad/
unit time. Physically speaking, these curves describe the output
of an ideal low-pass filter, cutting off all frequencies above $\omega_0$,
when the input signal is an isolated rectangular pulse.
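The truncated-integral approximation described above is easy to evaluate numerically. The following is a minimal sketch, not part of the original text: it approximates the sine integral by Simpson's rule instead of reading it from a table, and the names `sine_integral`, `lowpass_pulse`, and the step density are assumptions made here purely for illustration.

```python
import math

def _sin_over_u(u):
    # Integrand sin(u)/u, with its limiting value 1 at u = 0.
    return 1.0 if u == 0.0 else math.sin(u) / u

def sine_integral(x, steps_per_unit=50):
    """Simpson's-rule estimate of Si(x), the integral of sin(u)/u from 0 to x."""
    if x == 0.0:
        return 0.0
    sign = 1.0 if x > 0 else -1.0        # Si is an odd function of x
    x = abs(x)
    n = 2 * max(1, math.ceil(x * steps_per_unit / 2))  # even number of panels
    h = x / n
    total = _sin_over_u(0.0) + _sin_over_u(x)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * _sin_over_u(k * h)
    return sign * total * h / 3.0

def lowpass_pulse(t, w0):
    """(1/pi) * [Si(w0 (t + 1)) - Si(w0 (t - 1))]: the pulse rebuilt from frequencies below w0."""
    return (sine_integral(w0 * (t + 1)) - sine_integral(w0 * (t - 1))) / math.pi
```

Evaluating `lowpass_pulse(t, 16.0)` on a grid of `t` values should reproduce the qualitative behavior of the curve for the largest cutoff in Fig. 6.17: values near 1 inside the pulse, near 1/2 at the edges t = ±1, and small ripples about 0 outside.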

The Fourier integral representation of a nonperiodic function
can be used in essentially the same way as the Fourier series
representation of a periodic function in applications like those we
considered in Sec. 6.5. For instance, if an electrical circuit is
acted upon by a nonperiodic voltage $f(t)$ whose Fourier integral
representation is [Eqs. (9) and (10)]

$$f(t) = \int_{-\infty}^{\infty} g(\omega)e^{i\omega t}\,d\omega \qquad\text{where}\qquad g(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(s)e^{-i\omega s}\,ds$$

we can still, for purposes of analysis, think of $f(t)$ as being the
sum of an infinite number of complex voltages $e^{i\omega t}$. In fact, the
only distinction between the periodic and nonperiodic cases is
that in the latter the “spectrum” of $f(t)$ contains terms of all
frequencies and the amplitude, or intensity, of the component of
any given frequency $\omega$ is infinitesimal, namely,

$$g(\omega)\,d\omega$$

Now in Sec. 5.4 we saw that the current produced in a circuit of
impedance $Z(\omega)$ by a complex voltage $E_0 e^{i\omega t}$ is simply

$$\frac{E_0 e^{i\omega t}}{Z(\omega)}$$

Hence, to find the current produced by the nonperiodic voltage
$f(t)$, we need only divide the infinitesimal voltage

$$[g(\omega)\,d\omega]\,e^{i\omega t}$$

corresponding to the general frequency $\omega$ by the value of the
impedance $Z(\omega)$ at that frequency and then “add,” i.e., integrate,
all the infinitesimal currents thus obtained. The result is simply

$$I(t) = \int_{-\infty}^{\infty} \frac{g(\omega)}{Z(\omega)}\,e^{i\omega t}\,d\omega \qquad\text{where}\qquad g(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(s)e^{-i\omega s}\,ds$$

and $Z(\omega)$ is the impedance of the circuit.

A similar discussion of mechanical systems acted upon by 
nonperiodic forces, using the magnification ratio and phase shift 
[Eqs. (10) and (11), Sec. 5.3] instead of the complex impedance, 
can be based on Eq. (12a). 

exercises 

1 Make an amplitude-frequency plot for p = 2, 4, and 8 for the periodic function whose
definition in one period is


h f-J w<r “ 


ds 


0 


m - - 


m - 


-p<t <- 1 
-1 < t < 0 

0 < t < 1 

1 < t < p 

~p < t < - 1 
-i < t < i 
i < t < p 


)i+t 

i 


0 

' sin irt 
.0 


— p < t < - 
-1 < t < 

0 <t< 

1 < t < 

-p <t < - 
-1 < t < 

1 <t< 


2 Make an amplitude-frequency plot for p = 2, 4, and 8 for the periodic function whose
definition in one period is


m - 


-p < t < 0 
0 < t < p 


-p <t < 0 
0 < t < p 

3 Find the Fourier integral representation of each of the following functions: 




m 

{r 

(" 


b fit) - 


m = 


m 


™-{r 


t < 0 
t > 0 

> 7T J 

- « < t < 0 
0<t < I 

1 <t< =0 *• 

Find the Fourier integral representation of the function 
/ 0 -co CtC -1 

m - | “1 


f m - 


1 - t* 
0 


t C 0 
t > 0 

t 5 C 7T a /4 
t 2 > x 2 / 4 

t 2 C 1 
t 2 > 1 


and express the integral which approximates this function for frequencies between 0 and
$\omega_0$ in terms of sine integral functions.

5 Find the Fourier integral representation of the function 


/(f) - 



- 00 < t < -1 
-1 < f < 0 

0 < t < 1 

1 < t < M 


and express the integral which approximates this function for frequencies between 0 and $\omega_0$
in terms of sine integral functions.

6 Find the Fourier integral representation of the function 


■u 


f 2 < 1 
f 2 > 1 


- da> is a particular integral of the equa- 


2 /•■ 

‘;/o 


> (2 — « 2 ) cos ait + 3o> sir 
(2 - «*j* + 9w 2 


tion y" + 3 y' + 2y = /(f) 
where /(f) = I * 


f 2 < 1 
f 2 > 1 


8 Find a particular integral of the equation y" + ay' + by — f{i) 


where 




f 2 < 1 
l 2 > 1 


9 Find a particular integral of the equation $y'' + ay' + by = f(t)$ where $f(t)$ is the function
described in Exercise 5.

10 Find a particular integral of the equation $y'' + ay' + by = f(t)$ where $f(t)$ is the function
described in Exercise 6.

11 a Using the Fourier integral representation (12a) and the concepts of magnification ratio 
and phase angle, obtain a formula for the response of a mechanical system of one degree 
of freedom to a nonperiodic driving force. 

b Show that the response of an electrical circuit of impedance $Z(\omega)$ to a nonperiodic voltage

$$f(t) = \int_{-\infty}^{\infty} g(\omega)e^{i\omega t}\,d\omega$$


can be written 



where $\mathcal{R}\{g(\omega)/Z(\omega)\}$ and $\mathcal{I}\{g(\omega)/Z(\omega)\}$ denote, respectively, the real and the imaginary
part of the complex expression $g(\omega)/Z(\omega)$.

12 If we call $g(\omega)$ in Eq. (10) the Fourier transform of $f(t)$, show that, if the various
transforms exist, then

a The Fourier transform of $e^{\pm i\omega_0 t}f(t)$ is $g(\omega \mp \omega_0)$.
b The Fourier transform of $f'(t)$ is $i\omega\,g(\omega)$.
c The Fourier transform of $\int f(t)\,dt$ is $g(\omega)/i\omega$.
d The Fourier transform of $f_1(t)f_2(t)$ is

$$\int_{-\infty}^{\infty} g_1(u)\,g_2(\omega - u)\,du = \int_{-\infty}^{\infty} g_1(\omega - v)\,g_2(v)\,dv$$

where $g_1(\omega)$ and $g_2(\omega)$ are, respectively, the Fourier transforms of $f_1(t)$ and $f_2(t)$.

13 Let $f(t)$ be a function which is identically zero outside the interval $(-1,1)$, so that the
Fourier transform of $f(t)$ is

$$T(f) = \frac{1}{2\pi}\int_{-1}^{1} f(t)e^{-i\omega t}\,dt$$

By repeated differentiation of $T(1)$, show that

$$T(t^n) = \frac{i^n}{\pi}\,\frac{d^n S(\omega)}{d\omega^n}$$

where $S(\omega) = (\sin \omega)/\omega$. Explain how this result can be used to obtain the Fourier
transform of a single pulse defined between $-1$ and 1 by a convergent power series.*

14 Using the definitions of Exercise 13, show that

$$T(e^{in\pi t}) = \frac{1}{\pi}\,S(\omega - n\pi)$$

Explain how this result can be used to obtain the Fourier transform of a single pulse
defined between $-1$ and 1 by a Fourier series in either complex exponential or real
trigonometric form.

15 If $f(t)$ is a pulse defined between $-1$ and 1 by either of the equivalent series

$$\sum_{n=0}^{\infty} (a_n \cos n\pi t + b_n \sin n\pi t) \qquad \sum_{n=-\infty}^{\infty} c_n e^{in\pi t}$$

use the results of Exercise 14 to show that

$$a_n = \pi[\phi(n\pi) + \phi(-n\pi)]$$
$$b_n = i\pi[\phi(n\pi) - \phi(-n\pi)]$$
$$c_n = \pi\,\phi(n\pi)$$

where $\phi(\omega)$ is the Fourier transform of the pulse.


6.8 

From the Fourier integral to the Laplace transform

In many applications of the Fourier integral the function to be
represented is identically zero before some instant, usually $t = 0$.
When this is the case, the general Fourier transform pair, given
by Eqs. (9) and (10), Sec. 6.7, becomes the unilateral Fourier
transform pair†

(1) $$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} g(\omega)e^{i\omega t}\,d\omega \qquad g(\omega) = \int_0^{\infty} f(s)e^{-i\omega s}\,ds$$

* In particular problems, the book “Tables of the Function $\frac{\sin u}{u}$ and of Its
First Eleven Derivatives,” Harvard University Press, Cambridge, Mass.,
will be of considerable help.

FIGURE 6.18
The unit step function $u(t)$.

Useful as this is in many applications, it is still inadequate to
represent such a simple function as the so-called unit step
function $u(t)$ (Fig. 6.18):


$$u(t) = \begin{cases} 0 & t < 0 \\ 1 & t > 0 \end{cases}$$


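In fact, for this function

$$g(\omega) = \int_0^{\infty} e^{-i\omega s}\,ds = \left.\frac{e^{-i\omega s}}{-i\omega}\right|_0^{\infty} = \left.\frac{\cos \omega s - i \sin \omega s}{-i\omega}\right|_0^{\infty}$$

and this is completely meaningless, since both the cosine and sine
oscillate without approaching a limit as their arguments become infinite.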

As an artifice to handle this case and others like it, the
function $e^{-at}$ is sometimes inserted in place of the unit step function.
Now as we shall soon see, $e^{-at}$ has a unilateral Fourier transform
when $a$ is positive. Moreover, when $a$ approaches zero, $e^{-at}$,
considered for $t > 0$, approaches the unit step function (Fig. 6.19).
Hence, it is natural to hope that the order of the operations of
letting $a$ approach zero and taking the Fourier transform can be
interchanged. If this is the case, then we can postpone letting
$a \to 0$ until after the transform has been taken, and all will be well.
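The claim that $e^{-at}$ has a unilateral Fourier transform when $a$ is positive can be checked numerically before it is derived. The sketch below is an illustration added here, not part of the original text: it estimates the integral of $e^{-as}e^{-i\omega s}$ over $0 \le s \le$ `upper` by the trapezoidal rule and can be compared with the closed form $1/(a + i\omega)$. The function name, the finite upper limit, and the number of panels are all assumptions chosen for convenience.

```python
import cmath

def unilateral_transform_exp(a, w, upper=60.0, n=100_000):
    """Trapezoidal estimate of the integral of exp(-(a + i w) s) for 0 <= s <= upper.

    The finite upper limit stands in for infinity; this is only reasonable
    when a > 0, so that the integrand has decayed to a negligible size.
    """
    h = upper / n
    k_complex = a + 1j * w
    total = 0.5 * (1.0 + cmath.exp(-k_complex * upper))   # end-point terms
    for k in range(1, n):
        total += cmath.exp(-k_complex * k * h)
    return total * h
```

For example, `unilateral_transform_exp(1.0, 2.0)` should agree closely with `1 / (1 + 2j)`. With `a = 0` the truncated integral merely oscillates as `upper` is changed, which is the difficulty described in the text for the unit step function itself.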

FIGURE 6.19
Plot showing how $e^{-at}$ ($a, t > 0$) approaches the unit step function when
$a$ approaches zero.

In the present problem the development proceeds as follows.
Instead of transforming $u(t)$ we transform $e^{-at}$, getting

$$g(\omega) = \int_0^{\infty} e^{-as}e^{-i\omega s}\,ds = \left.\frac{e^{-(a+i\omega)s}}{-(a+i\omega)}\right|_0^{\infty} = \frac{1}{a + i\omega}$$

since the factor $e^{-as}$ now ensures that the antiderivative vanishes
at the upper limit. Thus,

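$$\begin{aligned}
f(t) &= \frac{1}{2\pi}\int_{-\infty}^{\infty} g(\omega)e^{i\omega t}\,d\omega \\
&= \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{e^{i\omega t}}{a + i\omega}\,d\omega \\
&= \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{\cos \omega t + i \sin \omega t}{a + i\omega}\,\frac{a - i\omega}{a - i\omega}\,d\omega \\
&= \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{(a \cos \omega t + \omega \sin \omega t) + i(a \sin \omega t - \omega \cos \omega t)}{a^2 + \omega^2}\,d\omega
\end{aligned}$$

Now, the imaginary part of the integrand, namely,

$$\frac{a \sin \omega t - \omega \cos \omega t}{a^2 + \omega^2}$$

is an odd function of $\omega$ and, hence, will vanish when integrated
between the limits $-\infty$ and $\infty$. On the other hand, the real part
of the integrand is an even function of $\omega$, and thus we can write

$$f(t) = \frac{1}{\pi}\int_0^{\infty} \frac{a \cos \omega t + \omega \sin \omega t}{a^2 + \omega^2}\,d\omega
= \frac{1}{\pi}\int_0^{\infty} \frac{a \cos \omega t}{a^2 + \omega^2}\,d\omega + \frac{1}{\pi}\int_0^{\infty} \frac{\omega \sin \omega t}{a^2 + \omega^2}\,d\omega$$

In the first integral in the right member, let $\omega = az$. Then

$$f(t) = \frac{1}{\pi}\int_0^{\infty} \frac{\cos atz}{1 + z^2}\,dz + \frac{1}{\pi}\int_0^{\infty} \frac{\omega \sin \omega t}{a^2 + \omega^2}\,d\omega$$

We are now in a position to let $a$ approach zero. As this happens,

$$f(t) \equiv e^{-at} \to u(t)$$

and thus we obtain

$$u(t) = \frac{1}{\pi}\int_0^{\infty} \frac{dz}{1 + z^2} + \frac{1}{\pi}\int_0^{\infty} \frac{\sin \omega t}{\omega}\,d\omega = \frac{1}{2} + \frac{1}{\pi}\int_0^{\infty} \frac{\sin \omega t}{\omega}\,d\omega$$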

This establishes the value of another definite integral, without 
benefit of an antiderivative. 
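As a brief check on this result (a worked note added here for illustration, using only the representation of $u(t)$ just obtained and the fact that the integral is an odd function of $t$), the same formula reproduces both branches of the unit step function:

```latex
\int_0^{\infty}\frac{\sin \omega t}{\omega}\,d\omega =
\begin{cases}
  \pi/2 & t > 0 \\
  0     & t = 0 \\
 -\pi/2 & t < 0
\end{cases}
\qquad\Longrightarrow\qquad
\frac{1}{2} + \frac{1}{\pi}\int_0^{\infty}\frac{\sin \omega t}{\omega}\,d\omega =
\begin{cases}
  1           & t > 0 \\
  \tfrac{1}{2} & t = 0 \\
  0           & t < 0
\end{cases}
```

The value $\pi/2$ for $t > 0$ is forced by $u(t) = 1$ there; the value at $t = 0$ is the usual mean of the right- and left-hand limits at a jump.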

The use we have just made of the so-called convergence
factor $e^{-at}$ is both artificial and clumsy, and it would be desirable
to make this procedure more systematic. To do this, let us define

$$F(t) = \begin{cases} 0 & t < 0 \\ e^{-at}f(t) & t > 0 \end{cases}$$

where $f(t)$ is the function of actual interest. Then, applying the
unilateral Fourier transformation to $F(t)$, which surely satisfies
the necessary conditions if $f(t)$ does, we have, for $t > 0$,

$$F(t) = e^{-at}f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} g(\omega)e^{i\omega t}\,d\omega$$

where

$$g(\omega) = \int_0^{\infty} F(s)e^{-i\omega s}\,ds = \int_0^{\infty} e^{-as}f(s)e^{-i\omega s}\,ds = \int_0^{\infty} f(s)e^{-(a+i\omega)s}\,ds$$


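We can now multiply both sides of the expression for $F(t)$ by
$e^{at}$, getting

$$f(t) = \frac{e^{at}}{2\pi}\int_{-\infty}^{\infty} g(\omega)e^{i\omega t}\,d\omega = \frac{1}{2\pi}\int_{-\infty}^{\infty} g(\omega)e^{(a+i\omega)t}\,d\omega$$

Moreover, from the last form of the expression for $g(\omega)$ it is clear
that $\omega$ enters the analysis only through the binomial $a + i\omega$. To
emphasize this fact, we shall write $g(a + i\omega)$ instead of $g(\omega)$. Then
the equations of the transform pair become

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} g(a + i\omega)e^{(a+i\omega)t}\,d\omega$$

$$g(a + i\omega) = \int_0^{\infty} f(s)e^{-(a+i\omega)s}\,ds$$

Finally, let us put $a + i\omega = \sigma$, noting that

$$d\omega = \frac{d(a + i\omega)}{i} = \frac{d\sigma}{i}$$

and that when $\omega = -\infty$, $\sigma = a - i\infty$ and when $\omega = \infty$,
$\sigma = a + i\infty$. Then we have the pair of equations

$$f(t) = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty} g(\sigma)e^{\sigma t}\,d\sigma \qquad g(\sigma) = \int_0^{\infty} f(s)e^{-\sigma s}\,ds$$ †

These constitute a Laplace transform pair.‡ The function $g(\sigma)$ is
known as the Laplace transform of $f(t)$. The integral for $f(t)$ is
known as the complex inversion integral.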

We have thus naturally and inevitably encountered the
Laplace transformation through our attempt to provide the
unilateral Fourier transformation with a “built-in” convergence
factor. This transformation is the foundation of the modern form
of the operational calculus, which was originated in quite another
form by the English electrical engineer Oliver Heaviside around
1890. In the next chapter we shall develop an extensive list of
formulas for the use of the Laplace transform itself, although the
meaning and use of the inversion integral we must leave to the
chapters on complex-variable theory.


† Most writers use $t$ rather than $s$ as the dummy variable in this integral,
and use $s$ rather than $\sigma$ in both integrals. In the next chapter we shall follow
this convention.

‡ Named for Pierre Simon de Laplace (1749-1827), who used such transforms
in his researches in the theory of probability.


CHAPTER SEVEN 


The Laplace 
Transformation 


7.1 

Theoretical preliminaries 

In the last chapter we traced the evolution of the Laplace
transformation from the unilateral Fourier integral. Our development
made it clear that, for the Laplace transform of $f(t)$ to exist and
for $f(t)$ to be recoverable from its transform, it is sufficient that

a In every interval of the form $0 \le t_1 \le t \le t_2$, $f(t)$ be bounded
and have at most a finite number of maxima and minima and a
finite number of finite discontinuities,

b There exist a real constant $a$ such that the improper integral
$\int_0^{\infty} |e^{-at}f(t)|\,dt = \int_0^{\infty} e^{-at}|f(t)|\,dt$ is convergent.

Functions satisfying condition a we shall henceforth describe
as piecewise regular.

Condition b is frequently replaced by the stronger, i.e., more
restrictive, condition that

b' There exist constants $a$, $M$, and $T$ such that
$e^{-at}|f(t)| < M$ for all $t > T$

Functions which satisfy condition b' are usually described as
being of exponential order.

Obviously, if $e^{-at}|f(t)| < M$, then $e^{-a_1 t}|f(t)| < M$ for all
$a_1 > a$. Thus the $a$ required by condition b' is not unique. The
greatest lower bound $\alpha_0$ of the set of all $a$’s which can be used in
condition b' is often called the abscissa of convergence of $f(t)$.
Under this definition, it is evident that the abscissa of
convergence $\alpha_0$ may not itself be one of the $a$’s which will serve in
condition b'. For instance, if $f(t) = t$, then, for every positive $a$ and no
others,

$$e^{-at}|f(t)| = te^{-at}$$

remains bounded and in fact approaches zero as $t$ becomes infinite.
Obviously the greatest lower bound of the set of all positive
numbers is the number 0. Hence, in this case $\alpha_0 = 0$, even though
for $\alpha_0$ itself

$$e^{-\alpha_0 t}|f(t)| = t$$

increases beyond all bounds as $t \to \infty$. In passing, we note that
the abscissa of convergence of a function may be negative. For
example, for $f(t) = e^{-2t}$ the abscissa of convergence is $-2$, since
as $t \to \infty$

$$e^{-at}|f(t)| = e^{-at}e^{-2t}$$

is bounded for all values of $a$ equal to or greater than $-2$ but for
none less than $-2$.
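To make the first of these examples concrete, the following short calculation (added here as an illustration; it is not part of the original text) finds an explicit constant $M$ for $f(t) = t$ and shows why no constant can serve when $a = 0$:

```latex
\frac{d}{dt}\left(t e^{-at}\right) = (1 - at)\,e^{-at} = 0
\quad\Longrightarrow\quad t = \frac{1}{a},
\qquad
\max_{t \ge 0}\, t e^{-at} = \frac{1}{a e} \quad (a > 0)
```

Thus for each $a > 0$ any $M$ greater than $1/(ae)$ satisfies condition b', while $1/(ae)$ grows without bound as $a \to 0^{+}$, in keeping with the fact that $\alpha_0 = 0$ is not itself an admissible value of $a$.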

Since $e^{-at}|f(t)| < M$ implies that $|f(t)| < Me^{at}$, it is clear
that, if a function is of exponential order, its absolute value need
not remain bounded as $t \to \infty$, but it must not increase more
rapidly than some constant multiple of a simple exponential
function of $t$. As the particular function $f(t) = \sin e^{t^2}$ shows, the
derivative of a function of exponential order is not necessarily of
exponential order. On the other hand, it is not difficult to show
that if $f(t)$ is piecewise regular and of exponential order, then
$\int_0^t f(t)\,dt$ is also of exponential order.

With a function $f(t)$ satisfying either conditions a and b or
conditions a and b', the Laplace transformation associates a
function of $s$, which we shall denote by $\mathcal{L}\{f(t)\}$† or simply by
$\mathcal{L}\{f\}$. This is defined by the formula

(1) $$\mathcal{L}\{f(t)\} = \int_0^{\infty} f(t)e^{-st}\,dt$$ ‡

The function $f(t)$ whose Laplace transform is a given function of $s$,
say $\phi(s)$, we shall call the inverse of $\phi(s)$ and shall denote by
the symbol $\mathcal{L}^{-1}\{\phi(s)\}$. From the concluding discussion of the
last chapter we have good reason to believe that the function
having $\phi(s)$ for its transform is given by the complex inversion
integral

(2) $$f(t) = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty} \phi(s)e^{st}\,ds$$

where $s$ is the complex variable $a + i\omega$, but we shall make no use
of this fact in the present chapter. Indeed, in this chapter we shall
regard $s$ as a real-valued parameter.

It is obvious that the derivation of the fundamental
properties of the Laplace transform will involve manipulation of the
defining integral (1). This integral is clearly improper, since its
upper limit is infinite, and it may also be improper because of
discontinuities of $f(t)$ at one or more points in the range of
integration. However, inasmuch as $f(t)$ is assumed to be piecewise
regular, these discontinuities can be at worst finite jumps which
can easily be handled by breaking up the range of integration
into subranges whose end points are the points of discontinuity.
We shall, therefore, usually not pay explicit attention to the
possible jumps of $f(t)$. Questions associated with the infinite
upper limit in (1) are more serious, however, and cannot be
passed over so lightly.

† Many writers consistently use only small letters to denote functions of $t$ and
use the corresponding capital letters to denote the transforms of these
functions. Thus what we shall write as $\mathcal{L}\{f(t)\}$ is often written as $F(s)$.

‡ Clearly, the variable of integration $t$ is a dummy variable and can be
replaced at pleasure by any other symbol. From time to time we shall find
it convenient to do this in our work.

At the outset, we recall that by an integral of the form

(3) $$\int_a^{\infty} h(s,t)\,dt$$

we mean

$$\lim_{b \to \infty} \int_a^b h(s,t)\,dt$$

and that for this limit to exist for a particular value of $s$, say
$s = s_1$, it must be possible to show that, for any $\varepsilon > 0$ there
exists a number $B$ such that

$$\left|\int_a^{\infty} h(s_1,t)\,dt - \int_a^b h(s_1,t)\,dt\right| \equiv \left|\int_b^{\infty} h(s_1,t)\,dt\right| < \varepsilon$$

for all values of $b > B$. The number $B$ will, of course, depend on
$\varepsilon$ and in general will also depend on $s_1$, the particular value of $s$
under consideration. It may happen, however, that one and the
same number $B$ will serve uniformly, or equally well, for all
members of some set of $s$-values. If and only if this is the case,
the integral (3) is said to converge uniformly, or to have the
property of uniform convergence, over that particular set of
$s$-values.

The importance of uniform convergence is apparent from
the following theorems, which we shall have to use in this chapter
but whose proofs we leave to more advanced texts.*

THEOREM 1

If $g(s,t)$ is a continuous function of $s$ and $t$ for $\alpha \le s \le \beta$ and $t \ge a$, if $f(t)$ is at
least piecewise regular for $t \ge a$, and if the integral $G(s) = \int_a^{\infty} f(t)g(s,t)\,dt$
converges uniformly over the interval $\alpha \le s \le \beta$, then $G(s)$ is a continuous function
of $s$ for $\alpha \le s \le \beta$.

Since the definitive property of a continuous function is that

$$\lim_{s \to s_0} G(s) = G(s_0)$$

this theorem states, in effect, that under the appropriate
conditions the limit of $G(s)$ can be found by taking the limit inside the
integral sign.


* See, for instance, H. S. Carslaw, “Fourier Series,” pp. 198-201, Dover 
Publications, Inc., New York, 1930. 



THEOREM 2

If $g(s,t)$ is a continuous function of $s$ and $t$ for $\alpha \le s \le \beta$ and $t \ge a$, if $f(t)$ is at
least piecewise regular for $t \ge a$, and if the integral $G(s) = \int_a^{\infty} f(t)g(s,t)\,dt$
converges uniformly over the interval $\alpha \le s \le \beta$, then

$$\int_{\alpha}^{\beta} G(s)\,ds = \int_{\alpha}^{\beta}\int_a^{\infty} f(t)g(s,t)\,dt\,ds = \int_a^{\infty}\int_{\alpha}^{\beta} f(t)g(s,t)\,ds\,dt$$

In words, this theorem states that under the appropriate
conditions the integral of $G(s)$ can be found by integrating inside the
integral sign.


THEOREM 3 

If $g(s,t)$ and $g_s(s,t) \equiv \partial g/\partial s$ are continuous functions of $s$ and $t$ for
$\alpha \le s \le \beta$ and $t \ge a$, if $f(t)$ is at least piecewise regular for $t \ge a$, if the integral

$$G(s) = \int_a^{\infty} f(t)g(s,t)\,dt$$

converges, and if $\int_a^{\infty} f(t)g_s(s,t)\,dt$ converges uniformly over the interval $\alpha \le s \le \beta$,
then

$$G'(s) \equiv \frac{d}{ds}\int_a^{\infty} f(t)g(s,t)\,dt = \int_a^{\infty} f(t)g_s(s,t)\,dt$$

for all values of $s$ such that $\alpha \le s \le \beta$.

In words, Theorem 3 states that, under the appropriate
conditions, the derivative of $G(s)$ can be found by differentiating inside
the integral sign.
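A small worked instance may help fix the idea. It is an illustration added here, not part of the original text, and it assumes the simplest possible choices: $g(s,t) = e^{-st}$, $a = 0$, and $f(t) = 1$, with $s$ restricted to an interval $0 < \alpha \le s \le \beta$ on which the required convergence is uniform.

```latex
G(s) = \int_0^{\infty} e^{-st}\,dt = \frac{1}{s},
\qquad
G'(s) = -\frac{1}{s^2},
\qquad
\int_0^{\infty} \frac{\partial}{\partial s}\left(e^{-st}\right) dt
  = -\int_0^{\infty} t\,e^{-st}\,dt = -\frac{1}{s^2}
```

Differentiating the evaluated integral and integrating the differentiated integrand give the same result, as Theorem 3 asserts.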

Obviously, if we take $g(s,t)$ to be the continuous function
$e^{-st}$ and take $a = 0$, the integral $G(s)$ referred to in the last three
theorems is precisely the Laplace transform of the function $f(t)$.
However, before we can apply these theorems to our work we
must determine under what conditions the Laplace transform
integral converges uniformly. We begin by proving the following
weaker result:


THEOREM 4 

If $f(t)$ is piecewise regular and of exponential order, then

$$\mathcal{L}\{f(t)\} = \int_0^{\infty} f(t)e^{-st}\,dt$$

converges absolutely for any value of $s$ greater than the abscissa of convergence
$\alpha_0$ of $f(t)$.

PROOF To establish this theorem, we must show that

(4) $$\lim_{b \to \infty} \int_0^b |f(t)e^{-st}|\,dt = \lim_{b \to \infty} \int_0^b |f(t)|e^{-st}\,dt$$

exists, and to do this it is necessary that we have an upper bound for $|f(t)|$ for
$t \ge 0$. Now, by hypothesis, $f(t)$ is of exponential order and, therefore, has an
abscissa of convergence $\alpha_0$. Hence, there exist numbers $M_1$ and $T$ such that for
all $t > T$ and any $a$ greater than but bounded away from $\alpha_0$, that is, any $a$ such that
$a > \alpha_1 > \alpha_0$, we have

$$|f(t)| < M_1 e^{at}$$

Moreover, since $f(t)$ is piecewise regular, it is bounded over the finite interval
$0 \le t \le T$; that is, there exists a positive number $M_2$ such that

$$|f(t)| < M_2 = (M_2 e^{-at})e^{at} \qquad \text{for } 0 \le t \le T$$

Thus if we let $M$ be the largest of the three numbers $M_1$, $M_2$, $M_2 e^{-aT}$,† it is clear
that

$$|f(t)| < Me^{at} \qquad \text{for all } t \ge 0$$

Hence, returning to the integral in (4) and replacing $f(t)$ by its upper bound, we
have

$$I \equiv \int_0^b |f(t)|e^{-st}\,dt \le \int_0^b Me^{at}e^{-st}\,dt = \left.\frac{Me^{-(s-a)t}}{-(s-a)}\right|_0^b = \frac{M}{s-a}\left(1 - e^{-(s-a)b}\right)$$

Now if $s > a$, the last expression increases monotonically and approaches
$M/(s - a)$ as $b$ becomes infinite. Therefore,

$$I \le \frac{M}{s - a} \qquad s > a > \alpha_0$$

Since the integrand of $I$ is everywhere nonnegative, it is clear that $I$ is a
monotonically increasing function of $b$. Hence, being bounded above, as we have just
shown, it must approach a limit as $b$ becomes infinite. Since $s > a > \alpha_0$
is clearly equivalent to the condition $s > \alpha_0$, the theorem is established.

Since the absolute value of an integral is always equal to or
less than the integral of the absolute value, it follows from the
preceding discussion that

$$\left|\int_0^b f(t)e^{-st}\,dt\right| \le \int_0^b |f(t)|e^{-st}\,dt \equiv I \le \frac{M}{s - a}$$

Hence, letting $b \to \infty$, we have the important result:


THEOREM 5 

If $f(t)$ is piecewise regular and of exponential order with abscissa of convergence
$\alpha_0$, then, for all values of $s$ and $a$ such that $s > a > \alpha_0$,

$$|\mathcal{L}\{f(t)\}| \le \frac{M}{s - a} \qquad \text{where } M \text{ is independent of } s$$
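The bound of Theorem 5 can be illustrated on the function $f(t) = t$ discussed earlier. The sketch below is an illustration added here, not part of the original text, and it rests on two stated assumptions: that $\mathcal{L}\{t\} = 1/s^2$ (Formula 4 of Sec. 7.3 with $n = 1$), and that $M = 1/(ae)$, the largest value of $te^{-at}$ for $t \ge 0$, may be used as the bounding constant (so that $|f(t)| \le Me^{at}$). The function names are hypothetical.

```python
import math

def laplace_of_t(s):
    # Assumed closed form: L{t} = 1/s^2 for s > 0 (Formula 4 with n = 1).
    return 1.0 / s**2

def theorem5_bound(s, a):
    # M / (s - a), with M = 1/(a e) the maximum of t * exp(-a t) on t >= 0.
    M = 1.0 / (a * math.e)
    return M / (s - a)

def bound_holds(s_values, a_values):
    """Check |L{t}| <= M/(s - a) for every pair with 0 < a < s."""
    return all(
        laplace_of_t(s) <= theorem5_bound(s, a)
        for s in s_values
        for a in a_values
        if 0.0 < a < s
    )
```

For any admissible pair the ratio of the bound to the transform is $s^2/[ae(s-a)]$, which is never less than $4/e$, so the bound is valid but not sharp; it does, however, decay like $1/s$, which is all that the corollaries that follow require.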

Finally, from Theorem 5 we draw the following interesting 
conclusions: 

COROLLARY 1 

If $f(t)$ is piecewise regular and of exponential order, then $\mathcal{L}\{f(t)\}$ approaches zero
as $s$ becomes infinite.


† Both $M_2$ and $M_2 e^{-aT}$ must be considered, because if $a > 0$ then $M_2$ is the
maximum of $M_2 e^{-at}$ on $0 \le t \le T$, whereas if $a < 0$ then $M_2 e^{-aT}$ is the
maximum of $M_2 e^{-at}$ on $0 \le t \le T$.



COROLLARY 2

If $f(t)$ is piecewise regular and of exponential order, then $s\mathcal{L}\{f(t)\}$ is bounded as $s$
becomes infinite.

These corollaries make it clear that not all functions of $s$ are
Laplace transforms, or at least not Laplace transforms of
functions of the “respectable” class defined by conditions a and b'.
For instance, $\phi(s) = s/(s - 1)$ does not approach zero as $s$
becomes infinite; hence it is not the Laplace transform of any
“respectable” function. Also, although $\phi(s) = 1/\sqrt{s}$ does
approach zero as $s$ becomes infinite, it is not the transform of any
“respectable” function, since $s\phi(s) = \sqrt{s}$ is not bounded as $s$
becomes infinite.

We are now in a position to establish the uniform
convergence of the integral defining $\mathcal{L}\{f(t)\}$:


THEOREM 6 

If $f(t)$ is piecewise regular and of exponential order with abscissa of convergence
$\alpha_0$, then, for any number $s_0 > \alpha_0$,

$$\mathcal{L}\{f(t)\} = \int_0^{\infty} f(t)e^{-st}\,dt$$

converges uniformly for all values of $s$ such that $s \ge s_0$.

PROOF To prove this theorem, we must show that, given any $\varepsilon > 0$, there
exists a number $B$, depending on $\varepsilon$ but not on $s$, such that

$$\left|\int_b^{\infty} f(t)e^{-st}\,dt\right| < \varepsilon \qquad \text{for all } b > B \text{ and all } s \ge s_0$$

Now

$$\left|\int_b^{\infty} f(t)e^{-st}\,dt\right| \le \int_b^{\infty} |f(t)|e^{-st}\,dt$$

and we know that for $s > \alpha_0$ the integral on the right approaches zero as $b$ becomes
infinite, since this is implied by the fact that

$$\int_0^{\infty} |f(t)|e^{-st}\,dt$$

is convergent for $s > \alpha_0$ (Theorem 4). In other words, given any $\varepsilon > 0$ and any
$s_0 > \alpha_0$, there exists a number $B$ such that

$$\int_b^{\infty} |f(t)|e^{-s_0 t}\,dt < \varepsilon \qquad \text{for all } b > B$$

Now if $s \ge s_0$, it is obvious that $e^{-st} \le e^{-s_0 t}$. Hence,

$$\int_b^{\infty} |f(t)|e^{-st}\,dt \le \int_b^{\infty} |f(t)|e^{-s_0 t}\,dt$$

and so for any $s \ge s_0$ the integral on the left is less than $\varepsilon$ for all values of $b$ greater
than the particular $B$ which suffices for the integral on the right. This value of $B$ is
clearly independent of $s$, and so the proof of the theorem is complete.

In succeeding sections we shall find that many relatively
complicated operations upon $f(t)$, such as differentiation and
integration, for instance, can be replaced by simple algebraic
operations, such as multiplication or division by $s$, upon the
transform of $f(t)$. This is analogous to the way in which such operations
as multiplication and division of numbers are replaced by the
simpler processes of addition and subtraction when we work not
with the numbers themselves, but with their logarithms. Our
primary purpose in this chapter is to develop rules of
transformation and tables of transforms which can be used, like tables of
logarithms, to facilitate the manipulation of functions and by
means of which we can recover the proper function from its
transform at the end of a problem.

exercises 

1 Prove that a function $f(t)$ is of exponential order if and only if $s$ can be chosen so that
$\lim_{t \to \infty} e^{-st}f(t) = 0$. If $f(t)$ is of exponential order, show that its abscissa of convergence $\alpha_0$ is
the greatest lower bound of all values of $s$ such that $\lim_{t \to \infty} e^{-st}f(t) = 0$.

2 Which of the following functions are of exponential order: (a) t n , (b) tan t, (c) ef, (d) 
cosh t, (e) l/t, (f) t*e*l 

3 Prove that, if a piecewise regular function satisfies condition b', it also satisfies condition b. 
(Hint : The proof of this is very much like the proof of Theorem 4.) 

4 Prove that, if a piecewise regular function satisfies condition b, it does not necessarily 
satisfy condition b'. Hint: Consider the function 

\Vn 

5 Prove that, if $f(t)$ is piecewise regular and of exponential order, then $\int_0^t f(t)\,dt$ is also
piecewise regular and of exponential order. Show also that, if $\alpha_0$ and $\alpha_1$ are, respectively, the
abscissas of convergence of $f(t)$ and $\int_0^t f(t)\,dt$ and if $\alpha_0 \ge 0$, then $\alpha_1 \le \alpha_0$. Is it necessarily
true that $\alpha_1 \le \alpha_0$ if $\alpha_0 < 0$?

7.2 

The general method

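The utility of the Laplace transformation is based primarily upon
the following three theorems:

THEOREM 1

$$\mathcal{L}\{c_1 f_1(t) + c_2 f_2(t)\} = c_1\mathcal{L}\{f_1(t)\} + c_2\mathcal{L}\{f_2(t)\}$$

PROOF To prove this, we have by definition

$$\begin{aligned}
\mathcal{L}\{c_1 f_1(t) + c_2 f_2(t)\} &= \int_0^{\infty} [c_1 f_1(t) + c_2 f_2(t)]e^{-st}\,dt \\
&= c_1\int_0^{\infty} f_1(t)e^{-st}\,dt + c_2\int_0^{\infty} f_2(t)e^{-st}\,dt \\
&= c_1\mathcal{L}\{f_1(t)\} + c_2\mathcal{L}\{f_2(t)\} \qquad \text{as asserted.}
\end{aligned}$$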




The extension of this theorem to linear combinations of more than 
two functions is obvious. 


THEOREM 2 

If $f(t)$ is a continuous, piecewise regular function of exponential order whose
derivative is also piecewise regular and of exponential order and if $f(t)$ approaches
the limit $f(0^+)$ as $t$ approaches zero from the right, then the Laplace transform of
$f'(t)$ is given by the formula

$$\mathcal{L}\{f'(t)\} = s\mathcal{L}\{f(t)\} - f(0^+)$$

provided $s$ is greater than the abscissa of convergence of $f(t)$.

PROOF To prove this, let us suppose for definiteness that there is a single
point, say $t = t_0$, where, though $f(t)$ is itself continuous, its derivative has a finite
jump, as suggested by Fig. 7.1. Then, by definition,

$$\mathcal{L}\{f'(t)\} = \int_0^{\infty} f'(t)e^{-st}\,dt
= \lim_{\substack{\delta_1,\delta_2,\delta_3 \to 0 \\ b \to \infty}} \left[\int_{\delta_1}^{t_0-\delta_2} f'(t)e^{-st}\,dt + \int_{t_0+\delta_3}^{b} f'(t)e^{-st}\,dt\right]$$

If we use integration by parts on these integrals, choosing

$$u = e^{-st} \qquad dv = f'(t)\,dt$$
$$du = -se^{-st}\,dt \qquad v = f(t)$$

we have

$$\mathcal{L}\{f'(t)\} = \lim_{\substack{\delta_1,\delta_2,\delta_3 \to 0 \\ b \to \infty}} \left[ e^{-st}f(t)\Big|_{\delta_1}^{t_0-\delta_2} + s\int_{\delta_1}^{t_0-\delta_2} f(t)e^{-st}\,dt + e^{-st}f(t)\Big|_{t_0+\delta_3}^{b} + s\int_{t_0+\delta_3}^{b} f(t)e^{-st}\,dt \right]$$

In the limit the two integrals which remain combine to give precisely

$$s\int_0^{\infty} f(t)e^{-st}\,dt = s\mathcal{L}\{f(t)\}$$

Similarly, the first evaluated portion yields

$$e^{-st_0}f(t_0^-) - f(0^+)$$

and the second yields simply

$$-e^{-st_0}f(t_0^+)$$

because, since $f(t)$ is of exponential order, $s$ can be chosen sufficiently large [i.e.,
greater than the abscissa of convergence of $f(t)$] that the contribution from the
upper limit is zero. Now $f(t)$ was assumed to be continuous. Hence at $t_0$ (as at all
other points) its right- and left-hand limits must be equal. Therefore, the terms

$$e^{-st_0}f(t_0^-) \qquad \text{and} \qquad -e^{-st_0}f(t_0^+)$$

cancel, leaving finally

$$\mathcal{L}\{f'(t)\} = s\mathcal{L}\{f(t)\} - f(0^+) \qquad \text{as asserted.}$$

FIGURE 7.1
A continuous function whose derivative has a point of discontinuity.

The extension of the preceding proof to functions whose
derivatives have more than one finite jump is obvious. The extension
of the theorem to the relatively unimportant case in which $f(t)$
itself is permitted to have finite jumps is indicated in Exercise 3.
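The derivative rule of Theorem 2 can be spot-checked numerically. The following sketch is an illustration added here, not part of the original text: it estimates Laplace transforms by Simpson's rule on a long finite interval and compares the two sides of the rule for the test function $f(t) = \cos bt$. The function names, the truncation point, and the choice of test function are all assumptions made for the example.

```python
import math

def laplace(f, s, upper=40.0, n=40_000):
    """Simpson's-rule estimate of the integral of f(t) exp(-s t) on [0, upper].

    n must be even; the finite upper limit stands in for infinity and is
    adequate only when exp(-s * upper) is negligible.
    """
    h = upper / n
    total = f(0.0) + f(upper) * math.exp(-s * upper)
    for k in range(1, n):
        t = k * h
        total += (4 if k % 2 else 2) * f(t) * math.exp(-s * t)
    return total * h / 3.0

def derivative_rule_sides(b, s):
    """Return (L{f'}, s L{f} - f(0+)) for the test function f(t) = cos(b t)."""
    f = lambda t: math.cos(b * t)
    df = lambda t: -b * math.sin(b * t)
    return laplace(df, s), s * laplace(f, s) - f(0.0)
```

With `b = 3.0` and `s = 2.0` both members of the returned pair should be close to $-9/13$, the value obtained from the closed forms $\mathcal{L}\{\cos bt\} = s/(s^2 + b^2)$ and $\mathcal{L}\{\sin bt\} = b/(s^2 + b^2)$.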

COROLLARY 1 

If both $f(t)$ and $f'(t)$ are continuous, piecewise regular functions of exponential
order and if $f''(t)$ is piecewise regular and of exponential order, then

$$\mathcal{L}\{f''(t)\} = s^2\mathcal{L}\{f(t)\} - sf(0^+) - f'(0^+)$$

where $f(0^+)$ and $f'(0^+)$ are, respectively, the values that $f(t)$ and $f'(t)$ approach
as $t$ approaches zero from the right.

PROOF This result follows immediately by applying Theorem 2 twice to $f''(t)$:

$$\begin{aligned}
\mathcal{L}\{f''(t)\} = \mathcal{L}\{[f'(t)]'\} &= s\mathcal{L}\{f'(t)\} - f'(0^+) \\
&= s[s\mathcal{L}\{f(t)\} - f(0^+)] - f'(0^+) \\
&= s^2\mathcal{L}\{f(t)\} - sf(0^+) - f'(0^+) \qquad \text{as asserted.}
\end{aligned}$$

The extension of this result to derivatives of higher order is
obvious (Exercise 1).


THEOREM 3 

If $f(t)$ is piecewise regular and of exponential order, then the Laplace transform
of $\int_a^t f(t)\,dt$ is given by the formula

$$\mathcal{L}\left\{\int_a^t f(t)\,dt\right\} = \frac{1}{s}\mathcal{L}\{f(t)\} + \frac{1}{s}\int_a^0 f(t)\,dt$$

PROOF To prove this theorem, we have by definition

$$\mathcal{L}\left\{\int_a^t f(t)\,dt\right\} = \int_0^{\infty}\left[\int_a^t f(x)\,dx\right]e^{-st}\,dt$$

where the dummy variable $x$ has been introduced for convenience. If we integrate
the last integral by parts, with

$$u = \int_a^t f(x)\,dx \qquad dv = e^{-st}\,dt$$
$$du = f(t)\,dt \qquad v = -\frac{e^{-st}}{s}$$

we have

$$\mathcal{L}\left\{\int_a^t f(t)\,dt\right\} = \left[-\frac{e^{-st}}{s}\int_a^t f(x)\,dx\right]_0^{\infty} + \frac{1}{s}\int_0^{\infty} f(t)e^{-st}\,dt$$

Since $f(t)$ is of exponential order, so, too, is its integral (Exercise 5, Sec. 7.1).
Hence $s$ can be chosen sufficiently large that the integrated portion vanishes at
the upper limit, leaving

$$\mathcal{L}\left\{\int_a^t f(t)\,dt\right\} = \frac{1}{s}\int_a^0 f(x)\,dx + \frac{1}{s}\mathcal{L}\{f(t)\} \qquad \text{as asserted.}$$

The extension of this result to repeated integrals of $f(t)$ is obvious
(Exercise 2).
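As a quick consistency check on Theorem 3 (a worked example added here for illustration, not part of the original text), take $a = 0$ and $f(t) = \cos bt$, and assume the standard transforms of $\cos bt$ and $\sin bt$ that are listed as Formulas 2 and 3 in Sec. 7.3:

```latex
\int_0^t \cos bx\,dx = \frac{\sin bt}{b},
\qquad
\mathcal{L}\left\{\frac{\sin bt}{b}\right\} = \frac{1}{b}\cdot\frac{b}{s^2 + b^2} = \frac{1}{s^2 + b^2},
\qquad
\frac{1}{s}\,\mathcal{L}\{\cos bt\} + \frac{1}{s}\int_0^0 \cos bt\,dt
  = \frac{1}{s}\cdot\frac{s}{s^2 + b^2} + 0 = \frac{1}{s^2 + b^2}
```

The two computations agree, and with $a = 0$ the second term of the formula drops out, which is the case most often met in practice.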

Although we need many more formulas before the Laplace
transformation can be applied effectively to specific problems,
Theorems 1, 2, and 3 allow us to outline all the essential steps in
the usual application of this method to the solution of differential
equations. Suppose that we are given the equation

$$ay'' + by' + cy = f(t)$$

If we take the Laplace transform of both sides, we have by
Theorem 1

$$a\mathcal{L}\{y''\} + b\mathcal{L}\{y'\} + c\mathcal{L}\{y\} = \mathcal{L}\{f(t)\}$$

Now applying Theorem 2 and its corollary, we have

$$a[s^2\mathcal{L}\{y\} - sy_0 - y_0'] + b[s\mathcal{L}\{y\} - y_0] + c\mathcal{L}\{y\} = \mathcal{L}\{f(t)\}$$

where $y_0$ and $y_0'$ are the given initial values of $y$ and $y'$. Collecting
terms on $\mathcal{L}\{y\}$ and then solving for $\mathcal{L}\{y\}$, we obtain finally

$$\mathcal{L}\{y\} = \frac{\mathcal{L}\{f(t)\} + (as + b)y_0 + ay_0'}{as^2 + bs + c}$$

Now $f(t)$ is a given function of $t$; hence its Laplace transform
(if it exists) is a perfectly definite function of $s$ (although as yet
we have no specific formulas for finding it). Moreover, $y_0$ and
$y_0'$ are definite numbers, known from the data of the problem.
Hence the transform of $y$ is a completely known function of $s$.
Thus if we had available a table of transforms, we could find in it
the function $y(t)$ having the right-hand side of the last equation
for its transform, and this function would be the formal solution to
our problem, initial conditions and all. The formal solution could
then be substituted into the given differential equation to verify
that it was indeed the genuine solution.
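The steps just outlined can be traced on a small example. The sketch below is an illustration added here, not part of the original text, and its particulars are assumptions: the equation $y'' + 3y' + 2y = 0$ with $y_0 = 1$, $y_0' = 0$, the candidate solution $y = 2e^{-t} - e^{-2t}$, and the use of $\mathcal{L}\{e^{-at}\} = 1/(s + a)$ (Formula 1 of Sec. 7.3) to transform the candidate. The function names are hypothetical.

```python
import math

def transform_of_solution(s, a, b, c, y0, dy0, forcing_transform=0.0):
    """L{y} for a y'' + b y' + c y = f(t), from the formula in the text.

    forcing_transform is the value of L{f(t)} at s (zero for a homogeneous equation).
    """
    return (forcing_transform + (a * s + b) * y0 + a * dy0) / (a * s * s + b * s + c)

def candidate(t):
    # Candidate solution obtained by partial fractions: y = 2 e^{-t} - e^{-2t}.
    return 2.0 * math.exp(-t) - math.exp(-2.0 * t)

def candidate_transform(s):
    # Transform of the candidate, term by term, assuming L{e^{-at}} = 1/(s + a).
    return 2.0 / (s + 1.0) - 1.0 / (s + 2.0)

def residual(t):
    """Left member y'' + 3 y' + 2 y evaluated on the candidate; should vanish."""
    y = candidate(t)
    dy = -2.0 * math.exp(-t) + 2.0 * math.exp(-2.0 * t)
    d2y = 2.0 * math.exp(-t) - 4.0 * math.exp(-2.0 * t)
    return d2y + 3.0 * dy + 2.0 * y
```

Here `transform_of_solution(s, 1, 3, 2, 1, 0)` equals $(s + 3)/[(s + 1)(s + 2)]$, which agrees with `candidate_transform(s)` for every $s$, and `residual(t)` vanishes, which is the final verification step mentioned in the text.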

This brief discussion illustrates the two great advantages of
the Laplace transformation in solving linear, constant-coefficient
differential equations: first, the way in which it reduces the
problem to one in algebra; second, the automatic way in which it takes
care of initial conditions without the necessity of constructing a
general solution and then specializing the arbitrary constants it
contains. Clearly, our immediate task is to implement this process
by establishing an adequate table of transforms.


EXERCISES 

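1 Show that $\mathcal{L}\{f'''\} = s^3\mathcal{L}\{f\} - s^2 f_0 - sf_0' - f_0''$. What is $\mathcal{L}\{f^{(n)}\}$?

2 Show that

$$\mathcal{L}\left\{\int_a^t\!\int_a^t f(t)\,dt\,dt\right\} = \frac{1}{s^2}\mathcal{L}\{f\} + \frac{1}{s^2}\int_a^0 f(t)\,dt + \frac{1}{s}\int_a^0\!\int_a^t f(t)\,dt\,dt$$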

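3 If $f(t)$ satisfies all the conditions of Theorem 2 except that it has an upward jump of
magnitude $J_0$ at $t = t_0$, show that

$$\mathcal{L}\{f'(t)\} = s\mathcal{L}\{f\} - f(0^+) - J_0 e^{-st_0}$$

4 Show that

$$\mathcal{L}\{f(at)\} = \frac{1}{a}\,\mathcal{L}\{f(t)\}\Big|_{s \to s/a}$$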


5 a Given $\mathcal{L}\{\cos t\} = s/(s^2 + 1)$, use the result of Exercise 4 to determine $\mathcal{L}\{\cos at\}$.
b Given $\mathcal{L}\{\sin t\} = 1/(s^2 + 1)$, use the result of Exercise 4 to determine $\mathcal{L}\{\sin at\}$.

6 Explain how the Laplace transform can be used to solve a system of simultaneous linear
differential equations with constant coefficients. In particular, given that $y = y_0$ and $z = z_0$
when $t = 0$, obtain formulas for the Laplace transforms of $y$ and $z$ if

$$a_1\frac{dy}{dt} + b_1 y + c_1\frac{dz}{dt} + d_1 z = f_1(t)$$

$$a_2\frac{dy}{dt} + b_2 y + c_2\frac{dz}{dt} + d_2 z = f_2(t)$$

7 The function

$$S[f(t)] = \int_0^{\pi} f(t)\sin nt\,dt \qquad n = 1, 2, 3, \ldots$$

is called the sine transform of $f(t)$. Show that

$$S(f'') = -n^2 S(f) + n[f(0) - (-1)^n f(\pi)]$$

8 The function

$$C[f(t)] = \int_0^{\pi} f(t)\cos nt\,dt \qquad n = 0, 1, 2, \ldots$$

is called the cosine transform of $f(t)$. Obtain a formula expressing $C(f'')$ in terms of $C(f)$.

9 Let $T[f(t)]$ be a general integral transform

$$T[f(t)] = \int_a^b f(t)K(s,t)\,dt$$

where $K(s,t)$ is the so-called kernel of the transformation. Obtain conditions on $K(s,t)$ so
that $T(f')$ and $T(f'')$ contain no terms involving the evaluation of $f$ or any of its derivatives.
Find at least one kernel satisfying these conditions.

10 If $f(t)$ and $f'(t)$ are both piecewise regular and of exponential order and if $f(t)$ is continuous
and $f(0^+) = 0$, show that as $s$ becomes infinite $\mathcal{L}\{f(t)\}$ tends to zero at least as rapidly as
$1/s^2$. Can this result be generalized?




7.3 

The transforms of special functions 


Among all the functions whose transforms we might now think
of tabulating, the most important are the simple ones

$$e^{-at} \qquad \cos bt \qquad \sin bt \qquad t^n$$

and the unit step function,

$$u(t) = \begin{cases} 0 & t < 0 \\ 1 & t > 0 \end{cases}$$

shown in Fig. 7.2.

FIGURE 7.2 The unit step function $u(t)$.

Once we know the transforms of these functions, nearly all the
formulas we shall need can be obtained through the use of a few
additional general theorems which we shall establish in the next
section. The specific results are the following:


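FORMULA 1

$$\mathcal{L}\{e^{-at}\} = \frac{1}{s + a}$$

FORMULA 2

$$\mathcal{L}\{\cos bt\} = \frac{s}{s^2 + b^2}$$

FORMULA 3

$$\mathcal{L}\{\sin bt\} = \frac{b}{s^2 + b^2}$$

FORMULA 4

$$\mathcal{L}\{t^n\} = \begin{cases} \dfrac{\Gamma(n + 1)}{s^{n+1}} & n > -1 \\[2ex] \dfrac{n!}{s^{n+1}} & n \text{ a positive integer} \end{cases}$$

FORMULA 5

$$\mathcal{L}\{u(t)\} = \frac{1}{s}$$

To prove Formula 1 we have simply

$$\mathcal{L}\{e^{-at}\} = \int_0^\infty e^{-at}e^{-st}\,dt = \left.\frac{e^{-(s+a)t}}{-(s + a)}\right|_0^\infty = \frac{1}{s + a} \qquad \text{if } s + a > 0$$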

To prove Formula 2, we have

$$\mathcal{L}\{\cos bt\} = \int_0^\infty \cos bt\,e^{-st}\,dt = \left.\frac{e^{-st}}{s^2 + b^2}(-s\cos bt + b\sin bt)\right|_0^\infty = \frac{s}{s^2 + b^2} \qquad \text{if } s > 0$$

To prove Formula 3, we have

$$\mathcal{L}\{\sin bt\} = \int_0^\infty \sin bt\,e^{-st}\,dt = \left.\frac{e^{-st}}{s^2 + b^2}(-s\sin bt - b\cos bt)\right|_0^\infty = \frac{b}{s^2 + b^2} \qquad \text{if } s > 0$$
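
As a rough numerical check of Formulas 1 to 3, one can approximate the defining integral by Simpson's rule and compare the result with the closed forms. The Python sketch below is an added illustration only; the helper name `laplace`, the truncation point `upper`, and the sample values of `a`, `b`, and `s` are assumptions chosen for convenience.

```python
import math

def laplace(f, s, upper=60.0, steps=60000):
    """Simpson's-rule estimate of the integral of f(t) e^{-st} over [0, upper]."""
    h = upper / steps
    total = f(0.0) + f(upper) * math.exp(-s * upper)
    for k in range(1, steps):
        t = k * h
        total += (4 if k % 2 else 2) * f(t) * math.exp(-s * t)
    return total * h / 3

a, b, s = 0.5, 3.0, 2.0   # sample values with s > 0 and s + a > 0

# Each pair is (numerical estimate, closed form from the formula).
formula_1 = (laplace(lambda t: math.exp(-a * t), s), 1 / (s + a))
formula_2 = (laplace(lambda t: math.cos(b * t), s), s / (s**2 + b**2))
formula_3 = (laplace(lambda t: math.sin(b * t), s), b / (s**2 + b**2))

for name, (numeric, exact) in zip("123", (formula_1, formula_2, formula_3)):
    print(f"Formula {name}: numeric {numeric:.8f}, closed form {exact:.8f}")
```

Truncating the integral at a finite upper limit is harmless here because the factor $e^{-st}$ is negligible there for the sample value of $s$.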

Before we can prove Formula 4 it will be necessary for us to
investigate briefly the so-called gamma or generalized factorial
function defined by the equation

$$\Gamma(x) = \int_0^\infty e^{-t}t^{x-1}\,dt \tag{1}$$

This improper integral can be shown to be convergent for all
$x > 0$.

To determine the simple properties of the gamma function
and its relation to the familiar factorial function

$$n! = n(n - 1)\cdots 3\cdot 2\cdot 1$$

defined in elementary algebra for positive integral values of $n$, let
us apply integration by parts to the definitive integral (1), taking

$$u = e^{-t} \qquad dv = t^{x-1}\,dt$$

$$du = -e^{-t}\,dt \qquad v = \frac{t^x}{x}$$

Then

$$\Gamma(x) = \left.\frac{t^x e^{-t}}{x}\right|_0^\infty + \frac{1}{x}\int_0^\infty e^{-t}t^x\,dt$$

Under the restriction $x > 0$, the integrated portion vanishes at
both limits. By comparison with (1), it is clear that the integral
which remains is simply $\Gamma(x + 1)$. Thus we have established the
important recurrence relation

$$\Gamma(x) = \frac{\Gamma(x + 1)}{x} \qquad x > 0 \tag{2}$$

$$x\,\Gamma(x) = \Gamma(x + 1) \tag{2a}$$

Moreover, we have specifically

$$\Gamma(1) = \int_0^\infty e^{-t}\,dt = \left.-e^{-t}\right|_0^\infty = 1$$

Therefore, using (2a),

$$\Gamma(2) = 1\cdot\Gamma(1) = 1$$
$$\Gamma(3) = 2\cdot\Gamma(2) = 2\cdot 1 = 2!$$
$$\Gamma(4) = 3\cdot\Gamma(3) = 3\cdot 2! = 3!$$

and in general

$$\Gamma(n + 1) = n! \qquad n = 1, 2, 3, \ldots \tag{3}$$
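
The recurrence (2a) and the factorial relation (3) are easy to try out numerically. The short Python sketch below is an added illustration; it relies on the standard-library function `math.gamma`, and the sample arguments are arbitrary choices.

```python
import math

# Recurrence (2a): x * Gamma(x) = Gamma(x + 1), tried at a few sample x > 0.
recurrence_gaps = [abs(x * math.gamma(x) - math.gamma(x + 1)) for x in (0.5, 1.3, 2.75)]

# Equation (3): Gamma(n + 1) = n! for n = 1, 2, 3, ...
factorial_pairs = [(math.gamma(n + 1), math.factorial(n)) for n in range(1, 8)]

# Recurrence (2) run backward extends Gamma to negative non-integers,
# e.g. Gamma(-1/2) = Gamma(1/2) / (-1/2).
gamma_minus_half = math.gamma(0.5) / (-0.5)

print(recurrence_gaps)
print(factorial_pairs)
print(gamma_minus_half, math.gamma(-0.5))
```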




FIGURE 7.3 Plot of the function $y = \Gamma(x)$.


The connection between the gamma function and ordinary
factorials is now clear. However, the gamma function constitutes
an essential extension of the idea of a factorial, since its argument
$x$ is not restricted to positive integral values but can vary
continuously over any interval which does not contain zero or a
negative integer.

From (2) and the fact that $\Gamma(1) = 1$, it is evident that $\Gamma(x)$
becomes infinite as $x$ approaches zero. It is thus clear that $\Gamma(x)$
cannot be defined for $x = 0, -1, -2, \ldots$ in a way consistent
with Eq. (2); hence we shall leave it undefined for these values of
$x$. For all other values of $x$, however, $\Gamma(x)$ is well defined, the use
of the recurrence formula (2a) effectively removing the restriction
that $x$ be positive, which the integral definition (1) requires. By
methods which need not concern us here, tables of $\Gamma(x)$ have
been constructed and can be found, usually as tables of $\log \Gamma(x)$,
in most elementary handbooks. Because of the recurrence formula
which the gamma function satisfies, these tables ordinarily cover
only a unit interval on $x$, usually the interval $1 \le x \le 2$. A plot
of $\Gamma(x)$ is shown in Fig. 7.3.



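EXAMPLE 1

What is the value of $I = \int_0^\infty \sqrt{z}\,e^{-z^3}\,dz$?

This integral is typical of many which can be reduced to the standard form of the gamma
function by a suitable substitution. In this case it is clear on comparing the given integral with
(1) that we should let

$$z = t^{1/3} \qquad dz = \tfrac{1}{3}t^{-2/3}\,dt$$

getting

$$I = \int_0^\infty t^{1/6}e^{-t}\left(\tfrac{1}{3}t^{-2/3}\,dt\right) = \frac{1}{3}\int_0^\infty e^{-t}t^{-1/2}\,dt = \frac{1}{3}\Gamma\!\left(\tfrac{1}{2}\right)$$

Since $\Gamma(\tfrac12)$ cannot be found in the usual table, which lists $\Gamma(x)$ only for $1 \le x \le 2$, it is
necessary to use the recurrence relation (2) to bring the argument of the gamma function into this
range. Thus

$$I = \frac{1}{3}\Gamma\!\left(\tfrac{1}{2}\right) = \frac{1}{3}\,\frac{\Gamma(\tfrac32)}{\tfrac12} = \frac{2}{3}(0.88623) = 0.59082\,\dagger$$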
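
The value just obtained can be cross-checked numerically. The Python sketch below is an added illustration: it estimates the integral $\int_0^\infty \sqrt{z}\,e^{-z^3}\,dz$ by a simple midpoint rule and compares it with $\tfrac13\Gamma(\tfrac12)$ computed from `math.gamma`. The cutoff at $z = 3$ and the number of subintervals are assumptions of the sketch, not part of the example.

```python
import math

def integrand(z):
    return math.sqrt(z) * math.exp(-z**3)

# Midpoint rule on [0, 3]; beyond z = 3 the factor exp(-z^3) is below 1e-11.
n = 200_000
h = 3.0 / n
numeric = h * sum(integrand((k + 0.5) * h) for k in range(n))

via_gamma = math.gamma(0.5) / 3               # I = (1/3) * Gamma(1/2)
via_recurrence = (2 / 3) * math.gamma(1.5)    # Gamma(1/2) = Gamma(3/2) / (1/2)
exact = math.sqrt(math.pi) / 3                # uses Gamma(1/2) = sqrt(pi)
print(numeric, via_gamma, via_recurrence, exact)
```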


Returning now to Formula 4, we have

$$\mathcal{L}\{t^n\} = \int_0^\infty t^n e^{-st}\,dt$$

In an attempt to reduce this to the standard form of the gamma
function, let us make the substitution

$$t = \frac{z}{s} \qquad dt = \frac{dz}{s}$$

Then

$$\mathcal{L}\{t^n\} = \int_0^\infty \left(\frac{z}{s}\right)^n e^{-z}\,\frac{dz}{s} = \frac{1}{s^{n+1}}\int_0^\infty z^n e^{-z}\,dz = \frac{\Gamma(n + 1)}{s^{n+1}}$$

Since $\Gamma(n + 1) = n!$ when $n$ is a positive integer, this establishes
the second part of Formula 4 also.

It is interesting to note that, if $n$ is negative,

$$s\mathcal{L}\{t^n\} = \frac{\Gamma(n + 1)}{s^n}$$

is not bounded as $s \to \infty$. Hence, according to Corollary 2,
Theorem 5, Sec. 7.1, this function of $s$ is not the Laplace transform
of a piecewise regular function of exponential order. This,
of course, is obvious, since when $n$ is negative, $t^n$, though of
exponential order (with abscissa of convergence $\alpha_0 = 0$), is not
bounded in the neighborhood of the origin and so is not piecewise
regular. It can be shown, however, that the improper integral
defining $\mathcal{L}\{t^n\}$ exists for $n > -1$ although it does not exist for
$n \le -1$. Formula 4 must therefore be qualified by the restriction
$n > -1$.

Formula 5 can be obtained immediately by taking $n = 0$ in
Formula 4.


EXAMPLE 2 

What is the Laplace transform of $\sinh bt$?

Since $\sinh bt = (e^{bt} - e^{-bt})/2$, we have

$$\mathcal{L}\{\sinh bt\} = \mathcal{L}\left\{\frac{e^{bt} - e^{-bt}}{2}\right\} = \frac{1}{2}\left(\frac{1}{s - b} - \frac{1}{s + b}\right) = \frac{b}{s^2 - b^2}$$

The analogy with Formula 3 for the transform of $\sin bt$ is apparent.


EXAMPLE 3 

If $\mathcal{L}\{y(t)\} = (s + 1)/(s^2 + s - 6)$, what is $y(t)$?

None of our formulas yields a transform resembling this one. However, using the method of
partial fractions, we can write

$$\frac{s + 1}{s^2 + s - 6} = \frac{s + 1}{(s - 2)(s + 3)} = \frac{A}{s - 2} + \frac{B}{s + 3} = \frac{A(s + 3) + B(s - 2)}{(s - 2)(s + 3)}$$

For this to be an identity we must have

$$s + 1 = A(s + 3) + B(s - 2)$$

Setting $s = 2$ and $s = -3$ in turn, we find from this that $A = \tfrac35$, $B = \tfrac25$. Hence

$$\mathcal{L}\{y(t)\} = \frac{1}{5}\left(\frac{3}{s - 2} + \frac{2}{s + 3}\right)$$

Formula 1 can now be applied to the individual terms, and we find

$$y(t) = \tfrac{1}{5}\left(3e^{2t} + 2e^{-3t}\right)$$

† Actually the value of $\Gamma(\tfrac12)$ is known exactly and in fact is equal to $\sqrt{\pi}$
(Exercise 10). Hence, in Example 1, $I = \sqrt{\pi}/3$.


EXAMPLE 4 

Solve for $y(t)$ from the simultaneous equations

$$y' + 2y + 6\int_0^t z\,dt = -2u(t)$$

$$y' + z' + z = 0$$

if $y_0 = -5$ and $z_0 = 6$.

We begin by taking the Laplace transform of each equation term by term:

$$[s\mathcal{L}\{y\} + 5] + 2\mathcal{L}\{y\} + \frac{6}{s}\mathcal{L}\{z\} = -\frac{2}{s}$$

$$[s\mathcal{L}\{y\} + 5] + [s\mathcal{L}\{z\} - 6] + \mathcal{L}\{z\} = 0$$

Obvious simplifications then lead to the following pair of linear algebraic equations in the
transforms of the unknown functions $y(t)$ and $z(t)$:

$$(s^2 + 2s)\mathcal{L}\{y\} + 6\mathcal{L}\{z\} = -2 - 5s$$

$$s\mathcal{L}\{y\} + (s + 1)\mathcal{L}\{z\} = 1$$

Since it is $y(t)$ that we are asked to find, we solve these simultaneous equations for $\mathcal{L}\{y\}$, getting

$$\mathcal{L}\{y\} = \frac{\begin{vmatrix} -5s - 2 & 6 \\ 1 & s + 1 \end{vmatrix}}{\begin{vmatrix} s^2 + 2s & 6 \\ s & s + 1 \end{vmatrix}}$$

Applying the method of partial fractions to this expression, we have

$$\mathcal{L}\{y\} = \frac{-5s^2 - 7s - 8}{s^3 + 3s^2 - 4s} = \frac{2}{s} - \frac{4}{s - 1} - \frac{3}{s + 4}$$

Finally, taking the inverse of each of these terms, we find

$$y(t) = 2u(t) - 4e^{t} - 3e^{-4t}$$
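
The algebra in this example can be spot-checked with exact rational arithmetic. The Python sketch below is an added illustration: it solves the pair of linear equations for the transform of $y$ by Cramer's rule at a few sample values of $s$ and compares the result with the partial-fraction form quoted above. The function names and sample points are assumptions of the sketch, and agreement at a handful of points is a sanity check rather than a proof.

```python
from fractions import Fraction

def transform_of_y(s):
    """Solve the 2x2 system for the transform of y at a given s by Cramer's rule."""
    a11, a12, b1 = s**2 + 2 * s, Fraction(6), -2 - 5 * s
    a21, a22, b2 = s, s + 1, Fraction(1)
    return (b1 * a22 - a12 * b2) / (a11 * a22 - a12 * a21)

def partial_fractions(s):
    """The decomposition 2/s - 4/(s - 1) - 3/(s + 4) quoted in the example."""
    return 2 / s - 4 / (s - 1) - 3 / (s + 4)

samples = [Fraction(2), Fraction(5, 2), Fraction(7), Fraction(11, 3)]
agree = all(transform_of_y(s) == partial_fractions(s) for s in samples)
initial_value = 2 - 4 - 3      # y(0+) from y(t) = 2u(t) - 4e^t - 3e^{-4t}
print("Cramer's rule and the partial fractions agree:", agree)
print("y(0+) =", initial_value)
```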


EXERCISES 

1 What is $\mathcal{L}\{\cosh bt\}$?

2 What is $\mathcal{L}\{\cos (at + b)\}$? [Hint: First express $\cos (at + b)$ as the difference of two terms.]

3 What is $\mathcal{L}\{\cos^2 bt\}$? (Hint: First express $\cos^2 bt$ as a function of $2bt$.)

4 What is $\mathcal{L}\{(t + 1)^2\}$?

5 Find the inverse of each of the following functions:

$$\frac{1}{s + 3} \qquad \frac{1}{s^4} \qquad \frac{1}{s^2 + 9} \qquad \frac{2s + 3}{s^2 + 9} \qquad \frac{s + 3}{(s + 1)(s - 3)}$$

6 Find the solution of each of the following differential equations:

a $y'' + 4y' - 5y = 0 \qquad y_0 = 1,\ y_0' = 0$

b $y'' - 4y = 0 \qquad y_0 = -1,\ y_0' = 1$



7.4

Further general theorems

We are now in a position to derive a number of theorems that
will be of considerable use in the application of the Laplace
transformation to practical problems. We begin with a result
which allows us to infer the behavior of a function $f(t)$ for small
positive values of $t$ from the behavior of $\mathcal{L}\{f(t)\}$ for large positive
values of $s$.

THEOREM 1

If $f(t)$ and $f'(t)$ are both piecewise regular and of exponential order, then

$$\lim_{s\to\infty} s\mathcal{L}\{f(t)\} = \lim_{t\to 0^+} f(t) = f(0^+)$$

PROOF For convenience we shall prove this under the additional assumption
that $f(t)$ is continuous, leaving as an exercise the proof under the less restrictive
conditions of the theorem as stated. We may thus begin with the result of Theorem
2, Sec. 7.2, namely,

$$\mathcal{L}\{f'(t)\} = s\mathcal{L}\{f(t)\} - f(0^+)$$

Hence, taking the limit of each side,

$$\lim_{s\to\infty}\mathcal{L}\{f'(t)\} = \lim_{s\to\infty} s\mathcal{L}\{f(t)\} - f(0^+) \tag{1}$$

However, under the conditions of the theorem, it follows from Corollary 1,
Theorem 5, Sec. 7.1, that

$$\lim_{s\to\infty}\mathcal{L}\{f'(t)\} = 0$$

Therefore, from (1),

$$\lim_{s\to\infty} s\mathcal{L}\{f(t)\} = f(0^+)$$

as asserted.
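
A small numerical experiment may help fix the content of Theorem 1. In the Python sketch below (an added illustration) the sample function $f(t) = \cos 2t + e^{-t}$ is an assumption chosen because its transform follows at once from Formulas 1 and 2 of Sec. 7.3; the sketch simply tabulates $s\mathcal{L}\{f(t)\}$ for increasing $s$ and compares it with $f(0^+) = 2$.

```python
def F(s):
    """Transform of f(t) = cos 2t + e^{-t}, from Formulas 1 and 2 of Sec. 7.3."""
    return s / (s**2 + 4) + 1 / (s + 1)

f_at_zero = 2.0   # f(0+) = cos 0 + e^0

table = [(s, s * F(s)) for s in (1e1, 1e2, 1e3, 1e4, 1e5)]
for s, value in table:
    print(f"s = {s:>8.0f}   s*F(s) = {value:.6f}")
```

The tabulated values should climb steadily toward 2 as $s$ grows.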

An analogous result which allows us to infer the behavior of a
function $f(t)$ for large positive values of $t$ from the behavior of
$\mathcal{L}\{f(t)\}$ for small values of $s$ is contained in the following theorem:

THEOREM 2

If $f(t)$ and $f'(t)$ are both piecewise regular and of exponential order and if the
abscissa of convergence of $f'(t)$ is negative, then

$$\lim_{s\to 0} s\mathcal{L}\{f(t)\} = \lim_{t\to\infty} f(t)$$

provided these limits exist.

PROOF Here, as in the proof of Theorem 1, we shall base our argument on the
additional assumption that $f(t)$ is continuous. Then again we may take limits in
the result of Theorem 2, Sec. 7.2, getting

$$\lim_{s\to 0}\mathcal{L}\{f'(t)\} = \lim_{s\to 0} s\mathcal{L}\{f(t)\} - f(0^+)$$

or

$$\lim_{s\to 0} s\mathcal{L}\{f(t)\} = \lim_{s\to 0}\mathcal{L}\{f'(t)\} + f(0^+) \tag{2}$$

But

$$\lim_{s\to 0}\mathcal{L}\{f'(t)\} = \lim_{s\to 0}\int_0^\infty f'(t)e^{-st}\,dt$$

and under the conditions of the present theorem we can invoke Theorems 6 and 1,
Sec. 7.1, and take the limit on the right inside the integral sign. Thus

$$\lim_{s\to 0}\mathcal{L}\{f'(t)\} = \int_0^\infty f'(t)\left(\lim_{s\to 0} e^{-st}\right)dt = \int_0^\infty f'(t)\,dt = f(t)\Big|_0^\infty = \lim_{t\to\infty} f(t) - f(0^+)$$

Substituting this into (2) we have finally

$$\lim_{s\to 0} s\mathcal{L}\{f(t)\} = \left[\lim_{t\to\infty} f(t) - f(0^+)\right] + f(0^+) = \lim_{t\to\infty} f(t)$$

as asserted.*

* In realistic applications of this theorem, $\mathcal{L}\{f(t)\}$ will be known, but $f(t)$ and
its abscissa of convergence will be unknown. Hence it is desirable that conditions
for the use of the theorem be expressed in terms of $\mathcal{L}\{f(t)\}$ rather than
$f(t)$. This can be done, since it is possible to show that Theorem 2 cannot be
applied if there is any value of $s$ with nonnegative real part for which $s\mathcal{L}\{f(t)\}$
is unbounded, but can be applied if no such value exists. For example, even
though $\lim_{s\to 0} s/(s^2 + 1)$ exists, Theorem 2 cannot be applied to $\mathcal{L}\{f(t)\} = 1/(s^2 + 1)$,
since this is unbounded for the values $s = \pm i = 0 \pm i$. In this
case, of course, $f(t) = \sin t$, and clearly $\lim_{t\to\infty} \sin t$ does not exist.
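
The contrast drawn in the footnote can be seen numerically. In the Python sketch below (an added illustration) the function $f(t) = 3 - 2e^{-t}$ is an assumed example that satisfies the hypotheses of Theorem 2, while $\sin t$ is the footnote's own counterexample; all names and sample points are choices made for this sketch.

```python
import math

def sF_good(s):
    """s times the transform of 3 - 2 e^{-t}, i.e. s * (3/s - 2/(s + 1))."""
    return 3 - 2 * s / (s + 1)

def sF_sine(s):
    """s times the transform of sin t, i.e. s / (s^2 + 1); Theorem 2 does not apply."""
    return s / (s**2 + 1)

small_s = [1e-1, 1e-2, 1e-3, 1e-4]
good = [sF_good(s) for s in small_s]      # approaches 3, the limit of 3 - 2 e^{-t}
sine = [sF_sine(s) for s in small_s]      # approaches 0 ...
late_sine = [math.sin(t) for t in (100.0, 101.5, 103.0)]   # ... yet sin t keeps oscillating
print(good)
print(sine)
print(late_sine)
```

For the first function the limit of $s\mathcal{L}\{f(t)\}$ matches the limit of $f(t)$; for $\sin t$ the limit in $s$ exists but says nothing about the behavior of the function for large $t$.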




When the Laplace transform of an unknown function $f(t)$
contains the factor $s$,† it is often convenient to find $f(t)$ by means
of the following theorem:

THEOREM 3

If $f(t)$ is piecewise regular and of exponential order, if $\mathcal{L}\{f(t)\} = s\phi(s)$, and if the
inverse of the factor $\phi(s)$ is continuous for $t > 0$, then

$$f(t) = \frac{d}{dt}\mathcal{L}^{-1}\{\phi(s)\}$$

PROOF To prove this theorem, let $F(t) \equiv \mathcal{L}^{-1}\{\phi(s)\}$ be the function which
has $\phi(s)$ for its transform. If $F(t)$ is continuous, as assumed, then, by Theorem 2,
Sec. 7.2,

$$\mathcal{L}\{F'(t)\} = s\mathcal{L}\{F(t)\} - F(0^+) = s\phi(s) - F(0^+)$$

But, by Theorem 1,

$$F(0^+) = \lim_{s\to\infty} s\mathcal{L}\{F(t)\} = \lim_{s\to\infty} s\phi(s) = 0$$

where the last step follows from Corollary 1, Theorem 5, Sec. 7.1, since $s\phi(s)$ is
the transform of the function $f(t)$, which, though unknown, is assumed to be
"respectable." Hence,

$$\mathcal{L}\{f(t)\} = \mathcal{L}\{F'(t)\}$$

since each is equal to $s\phi(s)$. Therefore‡

$$f(t) = F'(t) = \frac{d}{dt}\mathcal{L}^{-1}\{\phi(s)\}$$

as asserted.

EXAMPLE 1

What is $\mathcal{L}^{-1}\{s/(s^2 + 4)\}$?

By Formula 2, Sec. 7.3, we see immediately that the required inverse is $f(t) = \cos 2t$.
However, it is interesting that we can also obtain this result by suppressing the factor $s$, finding
the inverse $F(t)$ of the remaining portion of the transform, namely,

$$\frac{1}{s^2 + 4}$$

and then differentiating this inverse according to Theorem 3:

$$\mathcal{L}^{-1}\left\{\frac{s}{s^2 + 4}\right\} = \frac{d}{dt}\mathcal{L}^{-1}\left\{\frac{1}{s^2 + 4}\right\} = \frac{d}{dt}\left(\frac{\sin 2t}{2}\right) = \cos 2t$$

as before. The usual applications of this theorem are, of course, not of this trivial character.


† This can always be arranged, of course, by multiplying and dividing the
transform by $s$; that is, $\phi(s) = s[\phi(s)/s]$.

‡ This, of course, assumes the "obvious" theorem that, if two functions
have the same transform, they are identical. This is strictly true if the functions
are continuous. If discontinuities are permitted, the most we can say is
that two functions with the same transform cannot differ over any interval
of positive length, although they may differ at various isolated points. A
detailed discussion of this result (Lerch's theorem) would take us too far
afield.


When the Laplace transform of an unknown function $f(t)$
contains the factor $1/s$,† it is often convenient to find $f(t)$ by
means of the following theorem:

THEOREM 4

If $f(t)$ is piecewise regular and of exponential order and if $\mathcal{L}\{f(t)\} = \phi(s)/s$, then

$$f(t) = \int_0^t \mathcal{L}^{-1}\{\phi(s)\}\,dt$$

PROOF To prove this theorem, let $F(t) = \mathcal{L}^{-1}\{\phi(s)\}$ be the function which has
$\phi(s)$ for its transform. Then, by Theorem 3, Sec. 7.2,

$$\mathcal{L}\left\{\int_0^t F(t)\,dt\right\} = \frac{1}{s}\mathcal{L}\{F(t)\} + \frac{1}{s}\int_0^0 F(t)\,dt = \frac{1}{s}\mathcal{L}\{F(t)\} = \frac{\phi(s)}{s}$$

Thus both $f(t)$ and $\int_0^t F(t)\,dt = \int_0^t \mathcal{L}^{-1}\{\phi(s)\}\,dt$ have $\phi(s)/s$ for their Laplace
transform, and so must be equal, as asserted.

EXAMPLE 2 

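What is $\mathcal{L}^{-1}\{1/[s(s^2 + 4)]\}$?

Here, using the last theorem, we first suppress the factor $1/s$, getting

$$\phi(s) = \frac{1}{s^2 + 4}$$

By Formula 3, Sec. 7.3, the inverse of this is $F(t) = \tfrac12 \sin 2t$. Finally, we obtain $f(t)$ by
integrating $F(t)$ from 0 to $t$:

$$f(t) = \int_0^t \frac{\sin 2t}{2}\,dt = \left.-\frac{\cos 2t}{4}\right|_0^t = \frac{1 - \cos 2t}{4}$$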
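
Theorems 3 and 4 can both be tried on the function $\phi(s) = 1/(s^2 + 4)$ used in the two examples above. The Python sketch below is an added illustration: it estimates the transforms of $\cos 2t$ and of $(1 - \cos 2t)/4$ numerically and compares them with $s\phi(s)$ and $\phi(s)/s$. The helper `laplace`, its truncation point, and the sample value of `s` are assumptions of the sketch.

```python
import math

def laplace(f, s, upper=60.0, steps=60000):
    """Simpson's-rule estimate of the integral of f(t) e^{-st} over [0, upper]."""
    h = upper / steps
    total = f(0.0) + f(upper) * math.exp(-s * upper)
    for k in range(1, steps):
        t = k * h
        total += (4 if k % 2 else 2) * f(t) * math.exp(-s * t)
    return total * h / 3

s = 1.5
phi = 1 / (s**2 + 4)          # transform of F(t) = (sin 2t)/2

# Theorem 3: the derivative of F(t), namely cos 2t, has transform s * phi(s).
derivative_side = laplace(lambda t: math.cos(2 * t), s)
# Theorem 4: the integral of F(t) from 0 to t, (1 - cos 2t)/4, has transform phi(s)/s.
integral_side = laplace(lambda t: (1 - math.cos(2 * t)) / 4, s)

print(derivative_side, s * phi)
print(integral_side, phi / s)
```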


One of the most useful properties of the Laplace transformation
is contained in the so-called first shifting theorem:

THEOREM 5

$$\mathcal{L}\{e^{-at}f(t)\} = \mathcal{L}\{f(t)\}\Big|_{s \to s + a}$$

PROOF By definition,

$$\mathcal{L}\{e^{-at}f(t)\} = \int_0^\infty [e^{-at}f(t)]e^{-st}\,dt = \int_0^\infty f(t)e^{-(s+a)t}\,dt$$

and the last integral is in structure exactly the Laplace transform of $f(t)$ itself,
except that $s + a$ takes the place of $s$.

In words, Theorem 5 says that the transform of $e^{-at}$ times a function
of $t$ is equal to the transform of the function itself, with $s$ replaced
by $s + a$.

As a tool for finding inverses, this theorem asserts that, if we
reverse the substitution $s \to s + a$, that is, if we replace $s$ by
$s - a$, then the inverse of the modified transform $\phi(s - a)$ must
be multiplied by $e^{-at}$ to obtain the inverse of the original transform.
This procedure is summarized in the following result:

† This can always be arranged, of course, by multiplying and dividing the
transform by $s$; that is, $\phi(s) = (1/s)[s\phi(s)]$.




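COROLLARY 1

$$\mathcal{L}^{-1}\{\phi(s)\} = e^{-at}\mathcal{L}^{-1}\{\phi(s - a)\}$$

By means of Theorem 5 we can easily establish the following
important formulas:

FORMULA 1

$$\mathcal{L}\{e^{-at}\cos bt\} = \frac{s + a}{(s + a)^2 + b^2}$$

FORMULA 2

$$\mathcal{L}\{e^{-at}\sin bt\} = \frac{b}{(s + a)^2 + b^2}$$

FORMULA 3

$$\mathcal{L}\{e^{-at}t^n\} = \begin{cases} \dfrac{\Gamma(n + 1)}{(s + a)^{n+1}} & n > -1 \\[2ex] \dfrac{n!}{(s + a)^{n+1}} & n \text{ a positive integer} \end{cases}$$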


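EXAMPLE 3

If $\mathcal{L}\{y\} = (2s + 5)/(s^2 + 4s + 13)$, what is $y$?

By obvious manipulations we obtain

$$\mathcal{L}\{y\} = \frac{2(s + 2) + 1}{(s + 2)^2 + 3^2} = 2\left[\frac{s + 2}{(s + 2)^2 + 3^2}\right] + \frac{1}{3}\left[\frac{3}{(s + 2)^2 + 3^2}\right]$$

Hence, by Formulas 1 and 2,

$$y = 2e^{-2t}\cos 3t + \tfrac{1}{3}e^{-2t}\sin 3t$$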
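
The first shifting theorem and the inverse just found can be checked numerically. The Python sketch below is an added illustration; the helper `laplace`, its truncation point, and the sample value of `s` are assumptions. It compares a numerical transform of $e^{-2t}\cos 3t$ with the transform of $\cos 3t$ evaluated at $s + 2$, and then checks the complete answer $y = 2e^{-2t}\cos 3t + \tfrac13 e^{-2t}\sin 3t$ against $(2s + 5)/(s^2 + 4s + 13)$.

```python
import math

def laplace(f, s, upper=60.0, steps=60000):
    """Simpson's-rule estimate of the integral of f(t) e^{-st} over [0, upper]."""
    h = upper / steps
    total = f(0.0) + f(upper) * math.exp(-s * upper)
    for k in range(1, steps):
        t = k * h
        total += (4 if k % 2 else 2) * f(t) * math.exp(-s * t)
    return total * h / 3

s = 1.0

# Theorem 5: the transform of e^{-2t} cos 3t is that of cos 3t with s replaced by s + 2.
direct = laplace(lambda t: math.exp(-2 * t) * math.cos(3 * t), s)
shifted = (s + 2) / ((s + 2)**2 + 3**2)

def y(t):
    return 2 * math.exp(-2 * t) * math.cos(3 * t) + math.exp(-2 * t) * math.sin(3 * t) / 3

numeric = laplace(y, s)
closed = (2 * s + 5) / (s**2 + 4 * s + 13)
print(direct, shifted)
print(numeric, closed)
```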

EXAMPLE 4 

What is the solution of the differential equation

y" + 2y' + y = 

for which $y_0 = 1$ and $y_0' = -2$?

Transforming both sides of the given equation, we have 

(s 2 £{j/} - s + 2) + 2(s£|t/} — 1) + £{y] = - 
(s* + 2s + l)£{y} « - 


l f 3 
J [(« + 2) 2 + 3*_ 


' ’ (8 + 1)* (S + 1) S 

By Formula 3, the inverse of the first fraction in £{y} is t 3 e~‘. To find the inverse of the 
second fraction we can write it in the form 

$$\frac{(s + 1) - 1}{(s + 1)^2} = \frac{1}{s + 1} - \frac{1}{(s + 1)^2}$$

and take the inverse of each term, or we can suppress the factor $s$, take the inverse of what
remains, and differentiate this result. By either method we obtain immediately $e^{-t} - te^{-t}$. Hence



In this example the characteristic equation of the differential equation has repeated roots, 
and moreover the term on the right is a part of the complementary function; yet neither of 
these features requires any special treatment in the operational solution of the problem. This is 
another of the many advantages of the Laplace transform method of solving linear differential 
equations with constant coefficients. 

In some problems a system which becomes active at $t = 0$,
because of some initial disturbance, is subsequently acted upon
by another disturbance beginning at a later time, say $t = a$. The
analytical representation of such functions and the nature of
their Laplace transforms are therefore a matter of some importance.
To illustrate, suppose that we wish an expression describing
the function whose graph is shown in Fig. 7.4a, the curve
being congruent to the right half of the parabola $y = t^2$ shown in
Fig. 7.4b. It is not enough to recall the translation formula from
analytic geometry and write $f(t) = (t - a)^2$, because this equation,
even with the usual qualification that $f(t) = 0$ for $t < 0$,
defines the curve shown in Fig. 7.4c and not the required graph.
However, if we take the unit step function and translate it $a$ units
to the right by writing $u(t - a)$, we obtain the function shown
in Fig. 7.4d. Since this vanishes for $t < a$ and is equal to 1 for
$t > a$, the product $(t - a)^2 u(t - a)$ will be identically zero for
$t < a$ and will be identically equal to $(t - a)^2$ for $t > a$ and hence
will define precisely the arc we want. More generally, the
expression

$$f(t - a)u(t - a)$$

represents the function obtained by translating $f(t)$ $a$ units to
the right and "cutting it off," i.e., making it vanish identically
to the left of $a$.

FIGURE 7.4 Plot describing the graph of a function which has been translated and "cut off." Panels (a), (b), (c), (d).
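
The "translate and cut off" construction is easy to express in code. The Python sketch below is an added illustration; the names `u` and `translate_and_cut`, and the convention that the step takes the value 1 at the jump itself, are assumptions of the sketch (the text leaves the value at the jump unspecified).

```python
def u(t):
    """Unit step: 0 for t < 0, 1 for t > 0 (the value at t = 0 is taken as 1 here)."""
    return 1.0 if t >= 0 else 0.0

def translate_and_cut(f, a):
    """Return the function t -> f(t - a) u(t - a)."""
    return lambda t: f(t - a) * u(t - a)

a = 2.0
parabola_arc = translate_and_cut(lambda t: t**2, a)    # (t - a)^2 u(t - a)
uncut = lambda t: (t - a)**2                           # translation alone, as in Fig. 7.4c

values = [(t, parabola_arc(t), uncut(t)) for t in (0.0, 1.0, 2.0, 3.0, 4.5)]
for t, cut_value, uncut_value in values:
    print(f"t = {t:3.1f}   cut off: {cut_value:5.2f}   translated only: {uncut_value:5.2f}")
```

To the left of $t = a$ the cut-off version vanishes while the merely translated parabola does not, which is exactly the distinction between Figs. 7.4a and 7.4c.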

EXAMPLE 5 

What is the equation of the function whose graph is shown in Fig. 7.5a?

Clearly we can regard this function as the sum of the two translated step functions shown
in Fig. 7.5b. Hence its equation is

$$u(t - a) - u(t - b)$$

FIGURE 7.5 Plot showing how two step functions can be combined to give a rectangular pulse, or "filter function." Panels (a), (b).

Although the function shown in Fig. 7.5a is not ordinarily given a name, it could appropriately
be referred to as a filter function. For when any other function is multiplied by this "filter
function" it is annihilated completely, i.e., reduced identically to zero, outside the "pass band,"
$a < t < b$, and reproduced without any change whatsoever for values of $t$ within the "pass
band."


EXAMPLE 6 

What is the equation of the function whose graph is shown in Fig. 7.6?

FIGURE 7.6 A graph consisting of straight-line segments.

To obtain the segment of this function between 1 and 2 we must multiply the expression
$2(t - 1)$ by a factor which will be zero to the left of 1, unity between 1 and 2, and zero to the
right of 2. By Example 5, such a function is $u(t - 1) - u(t - 2)$. Hence

$$2(t - 1)[u(t - 1) - u(t - 2)]$$

defines the given function between 1 and 2 and vanishes elsewhere. Similarly

$$(-t + 4)[u(t - 2) - u(t - 4)]$$

defines the given function between 2 and 4 and vanishes elsewhere. The complete representation
of the function is therefore

$$2(t - 1)[u(t - 1) - u(t - 2)] + (-t + 4)[u(t - 2) - u(t - 4)]$$
$$= 2(t - 1)u(t - 1) - 3(t - 2)u(t - 2) + (t - 4)u(t - 4)$$
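
The two forms of the answer to Example 6 can be compared point by point. The Python sketch below is an added illustration; the function names, the sample grid, and the value assigned to the step at its jump are assumptions of the sketch.

```python
def u(t):
    """Unit step; the value at the jump is taken as 1 for definiteness."""
    return 1.0 if t >= 0 else 0.0

def piecewise_form(t):
    """Segment-by-segment form built from two filter functions."""
    return 2 * (t - 1) * (u(t - 1) - u(t - 2)) + (-t + 4) * (u(t - 2) - u(t - 4))

def collected_form(t):
    """The same function after collecting terms on each translated step."""
    return 2 * (t - 1) * u(t - 1) - 3 * (t - 2) * u(t - 2) + (t - 4) * u(t - 4)

samples = [k / 8 for k in range(0, 49)]    # t = 0, 0.125, ..., 6
max_gap = max(abs(piecewise_form(t) - collected_form(t)) for t in samples)
print("largest difference on the grid:", max_gap)
print([piecewise_form(t) for t in (0.5, 1.5, 2.0, 3.0, 5.0)])
```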

The transforms of functions that have been translated and
cut off are given by the so-called second shifting theorem:

THEOREM 6

$$\mathcal{L}\{f(t - a)u(t - a)\} = e^{-as}\mathcal{L}\{f(t)\} \qquad a \ge 0$$

PROOF To prove this, we have by definition

$$\mathcal{L}\{f(t - a)u(t - a)\} = \int_0^\infty f(t - a)u(t - a)e^{-st}\,dt = \int_a^\infty f(t - a)e^{-st}\,dt$$

since the integration effectively commences not at $t = 0$ but at $t = a$ because
$f(t - a)u(t - a)$ vanishes identically to the left of this point. Now let $t - a = T$,
$dt = dT$. Then the last integral becomes

$$\int_0^\infty f(T)e^{-s(T + a)}\,dT = e^{-as}\int_0^\infty f(T)e^{-sT}\,dT = e^{-as}\mathcal{L}\{f(t)\}$$

as asserted.

Before Theorem 6 can be applied, it is necessary that the
function being transformed be expressed in terms of the binomial
argument $t - a$ which appears in the unit step function. This
will not often be the case; so it will frequently be necessary to
alter the form of the function, as originally given, before it can
be transformed. In many cases this can be done by inspection.
On the other hand, we can always proceed in the following general
way. Suppose we wish to transform

$$f(t)u(t - a)$$

As it stands, this cannot be handled by Theorem 6; so we rewrite
it in the form

$$f[(t - a) + a]u(t - a) \equiv F(t - a)u(t - a)$$

where $F(t - a) = f[(t - a) + a] = f(t)$, or $F(t) = f(t + a)$. Now
Theorem 6 can be applied, and we have

$$\mathcal{L}\{f(t)u(t - a)\} = \mathcal{L}\{F(t - a)u(t - a)\} = e^{-as}\mathcal{L}\{F(t)\} = e^{-as}\mathcal{L}\{f(t + a)\}$$

Thus we have established the following useful result:

COROLLARY 1

$$\mathcal{L}\{f(t)u(t - a)\} = e^{-as}\mathcal{L}\{f(t + a)\}$$

As a tool for finding inverses, it is convenient to restate
Theorem 6 in the following form:

COROLLARY 2

If $\mathcal{L}^{-1}\{\phi(s)\} = f(t)$, then $\mathcal{L}^{-1}\{e^{-as}\phi(s)\} = f(t - a)u(t - a)$.

In words, this says that suppressing the factor $e^{-as}$ in a transform
requires that the inverse of what remains be translated $a$ units to
the right and cut off to the left of the point $t = a$.
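
Corollary 1 lends itself to a direct numerical check. In the Python sketch below (an added illustration) the function $f(t) = t^2$, the cut-off point $a = 1$, and the sample value of $s$ are assumptions; the helper `laplace` is a plain Simpson's-rule routine with an adjustable lower limit, so that the left member can be integrated from $t = a$ onward without stepping across the jump.

```python
import math

def laplace(f, s, lower=0.0, upper=60.0, steps=60000):
    """Simpson's-rule estimate of the integral of f(t) e^{-st} over [lower, upper]."""
    h = (upper - lower) / steps
    total = f(lower) * math.exp(-s * lower) + f(upper) * math.exp(-s * upper)
    for k in range(1, steps):
        t = lower + k * h
        total += (4 if k % 2 else 2) * f(t) * math.exp(-s * t)
    return total * h / 3

a, s = 1.0, 2.0
f = lambda t: t**2

# Left member: f(t) u(t - a) vanishes for t < a, so integrate from a onward.
left = laplace(f, s, lower=a)
# Right member: e^{-as} times the transform of f(t + a), numerically and in closed form.
right_numeric = math.exp(-a * s) * laplace(lambda t: f(t + a), s)
right_closed = math.exp(-a * s) * (2 / s**3 + 2 / s**2 + 1 / s)   # (t + 1)^2 = t^2 + 2t + 1
print(left, right_numeric, right_closed)
```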

EXAMPLE 7 

What is the transform of the function whose graph is shown in Fig. 7.7?

The equation of this function is obviously

$$F(t) = -(t^2 - 3t + 2)[u(t - 1) - u(t - 2)] = -f(t)u(t - 1) + f(t)u(t - 2)$$

where $f(t) = t^2 - 3t + 2$. However, the form of $f(t)$ is such that Theorem 6 cannot be applied
directly to either term in the expression for $F(t)$. Hence we use Corollary 1, observing that

$$f(t + 1) = (t + 1)^2 - 3(t + 1) + 2 = t^2 - t$$

and

$$f(t + 2) = (t + 2)^2 - 3(t + 2) + 2 = t^2 + t$$

FIGURE 7.7 A parabolic pulse, $-(t^2 - 3t + 2)$ between $t = 1$ and $t = 2$.

The required transform is, therefore,

$$\mathcal{L}\{F(t)\} = -e^{-s}\mathcal{L}\{t^2 - t\} + e^{-2s}\mathcal{L}\{t^2 + t\} = -e^{-s}\left(\frac{2}{s^3} - \frac{1}{s^2}\right) + e^{-2s}\left(\frac{2}{s^3} + \frac{1}{s^2}\right)$$

EXAMPLE 8

Find the solution of the equation $y' + 3y + 2\int_0^t y\,dt = f(t)$ for which $y_0 = 1$, if $f(t)$ is the
function whose graph is shown in Fig. 7.8.





FIGURE 7.8 A rectangular pulse.

In this case $f(t) = 2u(t - 1) - 2u(t - 2)$, and thus the differential equation can be written

$$y' + 3y + 2\int_0^t y\,dt = 2u(t - 1) - 2u(t - 2)$$

Taking transforms, we have

$$(s\mathcal{L}\{y\} - 1) + 3\mathcal{L}\{y\} + \frac{2}{s}\mathcal{L}\{y\} = \frac{2e^{-s}}{s} - \frac{2e^{-2s}}{s}$$

or

$$(s^2 + 3s + 2)\mathcal{L}\{y\} = 2e^{-s} - 2e^{-2s} + s$$

and

$$\mathcal{L}\{y\} = \frac{s}{(s + 1)(s + 2)} + \frac{2e^{-s}}{(s + 1)(s + 2)} - \frac{2e^{-2s}}{(s + 1)(s + 2)}$$

The first term can be written

$$\frac{2}{s + 2} - \frac{1}{s + 1}$$

Hence its inverse is $2e^{-2t} - e^{-t}$. If the exponential factors are suppressed in the second and third
terms of $\mathcal{L}\{y\}$, the algebraic portion which remains can be written

$$\frac{2}{(s + 1)(s + 2)} = \frac{2}{s + 1} - \frac{2}{s + 2}$$

and the inverse of this is $2e^{-t} - 2e^{-2t}$. However, because the factors $e^{-s}$ and $e^{-2s}$ were neglected,
it is necessary to take the last expression, translate it one unit to the right and cut it off to the
left of $t = 1$, and also translate it two units to the right and cut it off to the left of $t = 2$ in order
to obtain the inverses of the original terms. This gives for $y$

$$y = (2e^{-2t} - e^{-t}) + 2\left(e^{-(t-1)} - e^{-2(t-1)}\right)u(t - 1) - 2\left(e^{-(t-2)} - e^{-2(t-2)}\right)u(t - 2)$$

Plots of these three terms, as well as of their sum, that is, $y$ itself, are shown in Fig. 7.9.
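
The solution of Example 8 can be substituted back into the integrodifferential equation numerically. The Python sketch below is an added illustration: the derivative is approximated by a central difference, the integral by Simpson's rule, and the residual $y' + 3y + 2\int_0^t y\,dt - f(t)$ is evaluated at a few sample times away from the jumps of $f(t)$. All helper names, step sizes, and sample times are assumptions of the sketch.

```python
import math

def u(t):
    return 1.0 if t >= 0 else 0.0

def y(t):
    """Solution quoted in Example 8."""
    g = lambda tau: 2 * (math.exp(-tau) - math.exp(-2 * tau))
    return (2 * math.exp(-2 * t) - math.exp(-t)
            + g(t - 1) * u(t - 1) - g(t - 2) * u(t - 2))

def forcing(t):
    """Rectangular pulse f(t) = 2u(t - 1) - 2u(t - 2)."""
    return 2 * u(t - 1) - 2 * u(t - 2)

def simpson(func, lo, hi, steps=4000):
    h = (hi - lo) / steps
    total = func(lo) + func(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3

def residual(t, dt=1e-5):
    dy = (y(t + dt) - y(t - dt)) / (2 * dt)      # central-difference estimate of y'
    return dy + 3 * y(t) + 2 * simpson(y, 0.0, t) - forcing(t)

checks = {t: residual(t) for t in (0.5, 1.5, 3.0)}
print("y(0) =", y(0.0))
print(checks)
```

Residuals close to zero at times before, during, and after the pulse indicate that each translated term has been placed correctly.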

We have already made repeated use of Theorems 2 and 3 
of Sec. 7.2 on the transforms of derivatives and integrals. On 
the other hand, it is sometimes convenient or necessary to con- 
sider the derivatives and integrals of transforms. The basis for 
this is contained in the next two theorems. 


FIGURE 7.9 Plot showing the solution of Example 8.




THEOREM 7

If $f(t)$ is piecewise regular and of exponential order and if $\mathcal{L}\{f(t)\} = \phi(s)$, then
$\mathcal{L}\{tf(t)\} = -\phi'(s)$.

PROOF By definition we have

$$\mathcal{L}\{f(t)\} = \int_0^\infty f(t)e^{-st}\,dt = \phi(s)$$

and, differentiating this with respect to $s$, we obtain

$$\frac{d}{ds}\int_0^\infty f(t)e^{-st}\,dt = \phi'(s)$$

Now under our usual assumptions that $f(t)$ is piecewise regular and of exponential
order, the product $tf(t)$ also satisfies these conditions. Hence, by Theorem 6,
Sec. 7.1, the integral which results when $\mathcal{L}\{f(t)\}$ is differentiated partially with
respect to $s$, namely,

$$\int_0^\infty -tf(t)e^{-st}\,dt$$

converges uniformly. Therefore, according to Theorem 3, Sec. 7.1, the integral
for $\mathcal{L}\{f(t)\}$ can legitimately be differentiated with respect to $s$ inside the integral
sign. Thus, performing the differentiation, we have

$$\phi'(s) = \int_0^\infty f(t)[-te^{-st}]\,dt$$

or

$$\int_0^\infty [tf(t)]e^{-st}\,dt \equiv \mathcal{L}\{tf(t)\} = -\phi'(s)$$

as asserted.

By taking inverses in the assertion of Theorem 7 and then
solving for $f(t)$, we obtain the following useful result:




COROLLARY 1

If $\mathcal{L}\{f(t)\} = \phi(s)$, then $f(t) = \mathcal{L}^{-1}\{\phi(s)\} = -\dfrac{1}{t}\mathcal{L}^{-1}\{\phi'(s)\}$.

This is often helpful when the inverse of a transform cannot
conveniently be found but the inverse of the derivative of the
transform is known. The extension of Theorem 7 and its corollary
to repeated differentiation of transforms is obvious.

THEOREM 8

If $f(t)$ is piecewise regular and of exponential order, if $\mathcal{L}\{f(t)\} = \phi(s)$, and if
$f(t)/t$ has a limit as $t$ approaches zero from the right, then

$$\mathcal{L}\left\{\frac{f(t)}{t}\right\} = \int_s^\infty \phi(s)\,ds$$

PROOF By definition

$$\phi(s) = \mathcal{L}\{f(t)\} = \int_0^\infty f(t)e^{-st}\,dt$$

Hence, integrating from $s$ to $\infty$, we obtain

$$\int_s^\infty \phi(s)\,ds = \int_s^\infty\left[\int_0^\infty f(t)e^{-st}\,dt\right]ds$$

Now under the assumption that $\lim_{t\to 0^+} f(t)/t$ exists and that $f(t)$ itself is piecewise
regular and of exponential order, it follows from Theorems 6 and 2, Sec. 7.1,
that the integration with respect to $s$ can be performed inside the integral sign,
i.e., that the order of integration in the repeated integral can be reversed. Hence,
performing the integration,

$$\int_s^\infty \phi(s)\,ds = \int_0^\infty\!\!\int_s^\infty f(t)e^{-st}\,ds\,dt = \int_0^\infty f(t)\left[\frac{e^{-st}}{-t}\right]_s^\infty dt = \int_0^\infty\left[\frac{f(t)}{t}\right]e^{-st}\,dt = \mathcal{L}\left\{\frac{f(t)}{t}\right\}$$

By taking inverses in the assertion of Theorem 8 and then
solving for $f(t)$, we obtain the following result:

COROLLARY 1

If $\mathcal{L}\{f(t)\} = \phi(s)$, then $f(t) = \mathcal{L}^{-1}\{\phi(s)\} = t\,\mathcal{L}^{-1}\left\{\displaystyle\int_s^\infty \phi(s)\,ds\right\}$.

This is often useful in finding inverses when the integral of a
transform is simpler to work with than the transform itself.
The extension of Theorem 8 and its corollary to repeated integration
of transforms is immediate.

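EXAMPLE 9

What is $\mathcal{L}\{t^2 \sin 2t\}$?

By a repeated application of Theorem 7, we have

$$\mathcal{L}\{t^2 \sin 2t\} = (-1)^2\frac{d^2}{ds^2}\left(\frac{2}{s^2 + 4}\right) = \frac{12s^2 - 16}{(s^2 + 4)^3}$$

EXAMPLE 10

What is $y$ if $\mathcal{L}\{y\} = \ln [(s + 1)/(s - 1)]$?

Using Corollary 1 of Theorem 7, we have immediately

$$y = -\frac{1}{t}\mathcal{L}^{-1}\left\{\frac{d}{ds}\ln\frac{s + 1}{s - 1}\right\} = -\frac{1}{t}\mathcal{L}^{-1}\left\{\frac{1}{s + 1} - \frac{1}{s - 1}\right\} = -\frac{1}{t}\left(e^{-t} - e^{t}\right) = \frac{2\sinh t}{t}$$

EXAMPLE 11

What is $\mathcal{L}\{(\sin kt)/t\}$?

By Theorem 8, we have

$$\mathcal{L}\left\{\frac{\sin kt}{t}\right\} = \int_s^\infty \mathcal{L}\{\sin kt\}\,ds = \int_s^\infty \frac{k}{s^2 + k^2}\,ds = \operatorname{Tan}^{-1}\frac{s}{k}\Big|_s^\infty = \frac{\pi}{2} - \operatorname{Tan}^{-1}\frac{s}{k} = \operatorname{Cot}^{-1}\frac{s}{k}$$

EXAMPLE 12

What is $y$ if $\mathcal{L}\{y\} = s/(s^2 - 1)^2$?

Using Corollary 1 of Theorem 8, we have immediately

$$y = t\,\mathcal{L}^{-1}\left\{\int_s^\infty \frac{s}{(s^2 - 1)^2}\,ds\right\} = t\,\mathcal{L}^{-1}\left\{\left.-\frac{1}{2(s^2 - 1)}\right|_s^\infty\right\} = t\,\mathcal{L}^{-1}\left\{\frac{1}{2(s^2 - 1)}\right\}$$

$$= \frac{t}{4}\mathcal{L}^{-1}\left\{\frac{1}{s - 1} - \frac{1}{s + 1}\right\} = \frac{t}{4}\left(e^{t} - e^{-t}\right) = \frac{t\sinh t}{2}$$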
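
Theorems 7 and 8 can be checked numerically on two of the examples above. The Python sketch below is an added illustration; the helper `laplace`, the sample values of `s` and `k`, and the value assigned to $(\sin kt)/t$ at $t = 0$ (its limit, $k$) are assumptions of the sketch.

```python
import math

def laplace(f, s, upper=60.0, steps=60000):
    """Simpson's-rule estimate of the integral of f(t) e^{-st} over [0, upper]."""
    h = upper / steps
    total = f(0.0) + f(upper) * math.exp(-s * upper)
    for k in range(1, steps):
        t = k * h
        total += (4 if k % 2 else 2) * f(t) * math.exp(-s * t)
    return total * h / 3

s, k = 2.0, 3.0

# Theorem 7 applied twice (Example 9): transform of t^2 sin 2t.
ex9_numeric = laplace(lambda t: t**2 * math.sin(2 * t), s)
ex9_closed = (12 * s**2 - 16) / (s**2 + 4)**3

# Theorem 8 (Example 11): transform of (sin kt)/t, with the limit k used at t = 0.
def sin_over_t(t):
    return k if t == 0 else math.sin(k * t) / t

ex11_numeric = laplace(sin_over_t, s)
ex11_closed = math.atan2(k, s)      # equals Cot^{-1}(s/k) for positive s and k

print(ex9_numeric, ex9_closed)
print(ex11_numeric, ex11_closed)
```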


EXERCISES 

Find the Laplace transform of each of the following functions: 


1 $u(t - a)$

3 $t^2 u(t - 2)$

5 $e^{2t} u(t - 1)$

7 /<»-{; 

Fig. 7.10. 


0 < i <* 
v < t 


2 $\cos (t - 1)\,u(t - 1)$

4 $(t^2 - 1)\,u(t - 1)$

6 $\cos 3t\;u(t - 3)$

0 <t < 2 
2 < t 

10 See Fig. 7.11. 


8 /«) - 


FIGURE 7.1 1 



2 sinh t 
- 


= Cot -1 - 
h 


11 


1 — cos 32 


12 


- 1 




15 / t sin 2t dt 


Find the inverse of each of the following transforms: 

21 22 - 

(« + 2) 4 0 

23 lii 24 _ 

9s 2 + 6s + 5 s 

1 /'Tv, 


> (s + l)(s 2 + 2s + 5) 


(s - l)(s - 2) 
, s 2 - 1 


s — 1 

34 s In b 2 

s + 1 


2s + 3 

(s 2 + 3s + 2) 2 


40 Find the values of $f(0^+)$ and of $\lim_{t\to\infty} f(t)$, if it exists, if $\mathcal{L}\{f(t)\}$ is:


s 3 + 6s 2 + 11s + 6 
s 2 + s + 1 


s + 3 

2s 3 - 3s 2 - 2s 


41 Show that, under appropriate conditions,

$$\lim_{s\to\infty} s[s\mathcal{L}\{f(t)\} - f(0^+)] = f'(0^+)$$

$$\lim_{s\to\infty} s[s^2\mathcal{L}\{f(t)\} - sf(0^+) - f'(0^+)] = f''(0^+)$$

What conditions beyond those of Theorem 1 are necessary for the validity of these results?
Can the value of $f^{(n)}(0^+)$ be obtained by an extension of these formulas?

42 Show that, under appropriate conditions,

$$\lim_{s\to 0} s[s\mathcal{L}\{f(t)\} - f(0^+)] = \lim_{t\to\infty} f'(t)$$

What conditions beyond those of Theorem 2 are necessary for the validity of this result?
Can this result be generalized to the determination of $\lim_{t\to\infty} f^{(n)}(t)$ from $\mathcal{L}\{f(t)\}$?




Solve the following differential equations: 


43 

44 

45 

46 

47 

48 


80 


y" + 4y' + 3y - e 1 

y" + 4 y ~ cos 2 1 

y" + 3 if +2 y - u(t - 1) 

y" + 4j/' + 4 y - (t~ 2)e~v-»u(t - 2) 

y" + 2y" + y - 0 


Vo = Vo - 1 
Vo = -2, i/o = 1 
Vo = 0, Vo = 1 
Vo = 1, Vo = -1 
Vo - Vo = Vo" - 0, v" 


1 


Prove Theorem 1 without assuming that $f(t)$ is continuous. (Hint: Use the result of
Exercise 3, Sec. 7.2.)

Prove Theorem 2 without assuming that $f(t)$ is continuous. (Hint: Use the result of
Exercise 3, Sec. 7.2.)

Where in the proof of Theorem 2 is use made of the hypothesis that the abscissa of
convergence of $f'(t)$ is negative?

7.5

The Heaviside expansion theorems

The frequent use we have had to make of partial fractions indi- 
cates clearly the importance of this technique in operational 
calculus. It is therefore highly desirable to have the procedure 
systematized as much as possible. The following theorems, 
usually associated with the name of Heaviside, are of great utility 
in this connection: 


THEOREM 1 

If $f(t) = \mathcal{L}^{-1}\{p(s)/q(s)\}$, where $p(s)$ and $q(s)$ are polynomials and the degree of
$q(s)$ is greater than the degree of $p(s)$, then the term in $f(t)$ corresponding to an
unrepeated linear factor $s - a$ of $q(s)$ is

$$\frac{p(a)}{q'(a)}e^{at} \qquad \text{or equally well} \qquad \frac{p(a)}{Q(a)}e^{at}$$

where $Q(s)$ is the product of all the factors of $q(s)$ except $s - a$.

PROOF In the familiar partial-fraction decomposition of $p(s)/q(s)$, an unrepeated
linear factor $s - a$ of $q(s)$ gives rise to a single fraction of the form
$A/(s - a)$. Hence, if we denote by $h(s)$ the sum of the fractions corresponding
to all the other factors of $q(s)$, we can write

$$\frac{p(s)}{q(s)} = \frac{A}{s - a} + h(s)$$

where, since $s - a$ is an unrepeated factor of $q(s)$, $h(s)$ remains finite as $s$ approaches
$a$. Multiplying this identity by $s - a$ then gives

$$\frac{(s - a)p(s)}{q(s)} \equiv \frac{p(s)}{q(s)/(s - a)} = A + (s - a)h(s)$$

If we now let $s$ approach $a$, the second term in the right member vanishes, and
we have

$$A = \lim_{s\to a}\frac{p(s)}{q(s)/(s - a)}$$

The limit of the numerator here is evidently $p(a)$. The denominator appears as
an indeterminate of the form 0/0. However, if we evaluate it as usual according
to L'Hospital's rule by differentiating numerator and denominator and then
letting $s$ approach $a$, we obtain just $q'(a)$. Hence

$$A = \frac{p(a)}{q'(a)}$$

On the other hand, we could have eliminated the indeterminacy before passing
to the limit simply by canceling $s - a$ into $q(s)$, which by hypothesis contains
this factor. Doing this, we obtain the equivalent form of $A$:

$$A = \frac{p(a)}{Q(a)}$$

Finally, taking inverses, it is clear that the fraction $A/(s - a)$ gives rise to the
term

$$\frac{p(a)}{q'(a)}e^{at} = \frac{p(a)}{Q(a)}e^{at}$$

in the inverse $f(t)$, as asserted.

If q(s ) contains only unrepeated linear factors, then by 
applying Theorem 1 to each factor in turn, we obtain the follow- 
ing useful result : 

COROLLARY 1 

If $f(t) = \mathcal{L}^{-1}\{p(s)/q(s)\}$ and if q(s) is completely factorable into unrepeated
linear factors

$$(s - a_1),\ (s - a_2),\ \ldots,\ (s - a_n)$$

then

$$f(t) = \sum_{i=1}^{n} \frac{p(a_i)}{q'(a_i)}\,e^{a_i t} = \sum_{i=1}^{n} \frac{p(a_i)}{Q_i(a_i)}\,e^{a_i t}$$

where $Q_i(s)$ is the product of all the factors of q(s) except the factor $s - a_i$.


THEOREM 2 

If $f(t) = \mathcal{L}^{-1}\{p(s)/q(s)\}$, where p(s) and q(s) are polynomials and the degree of
q(s) is greater than the degree of p(s), then the terms in f(t) corresponding to a
repeated linear factor $(s - a)^r$ in q(s) are

$$\left[\frac{\phi^{(r-1)}(a)}{(r-1)!} + \frac{\phi^{(r-2)}(a)}{(r-2)!}\,\frac{t}{1!} + \cdots + \frac{\phi'(a)}{1!}\,\frac{t^{r-2}}{(r-2)!} + \phi(a)\,\frac{t^{r-1}}{(r-1)!}\right] e^{at}$$

where $\phi(s)$ is the quotient of p(s) and all the factors of q(s) except $(s - a)^r$.

PROOF From the familiar theory of partial fractions we recall that a repeated
linear factor $(s - a)^r$ of q(s) gives rise to the component fractions

$$\frac{A_1}{s-a} + \frac{A_2}{(s-a)^2} + \cdots + \frac{A_{r-1}}{(s-a)^{r-1}} + \frac{A_r}{(s-a)^r}$$

If we let h(s) denote, as before, the sum of the fractions corresponding to all the
other factors of q(s), we have

$$\frac{p(s)}{q(s)} = \frac{\phi(s)}{(s-a)^r} = \frac{A_1}{s-a} + \frac{A_2}{(s-a)^2} + \cdots + \frac{A_{r-1}}{(s-a)^{r-1}} + \frac{A_r}{(s-a)^r} + h(s)$$



SEC. 7.5 


THE HEAVISIDE EXPANSION THEOREMS 


25 7 


Multiplying this identity by $(s - a)^r$ gives

$$\phi(s) = A_1(s-a)^{r-1} + A_2(s-a)^{r-2} + \cdots + A_{r-1}(s-a) + A_r + (s-a)^r h(s)$$

If we put s = a in this expression, we obtain

$$\phi(a) = A_r$$

If we now differentiate $\phi(s)$, we have

$$\phi'(s) = A_1(r-1)(s-a)^{r-2} + A_2(r-2)(s-a)^{r-3} + \cdots + A_{r-1} + r(s-a)^{r-1}h(s) + (s-a)^r h'(s)$$

Again setting s = a, we find this time

$$\phi'(a) = A_{r-1}$$

Continuing in this fashion, noting that the first r - 1 derivatives of the product
$(s - a)^r h(s)$ will all vanish when s = a, we obtain successively

$$\phi''(a) = 2!\,A_{r-2}$$
$$\phi'''(a) = 3!\,A_{r-3}$$
$$\cdots$$
$$\phi^{(r-1)}(a) = (r-1)!\,A_1$$

or

$$A_{r-k} = \frac{\phi^{(k)}(a)}{k!} \qquad k = 0, 1, \ldots, r-1$$

The terms in the expansion of p(s)/q(s) which correspond to the factor $(s - a)^r$
are, therefore,

$$\frac{\phi^{(r-1)}(a)}{(r-1)!}\,\frac{1}{s-a} + \frac{\phi^{(r-2)}(a)}{(r-2)!}\,\frac{1}{(s-a)^2} + \cdots + \frac{\phi'(a)}{1!}\,\frac{1}{(s-a)^{r-1}} + \phi(a)\,\frac{1}{(s-a)^r}$$

Recalling that

$$\mathcal{L}^{-1}\left\{\frac{1}{(s-a)^n}\right\} = \frac{t^{n-1}e^{at}}{(n-1)!}$$

it is evident that the terms in f(t) which arise from these fractions are

$$\frac{\phi^{(r-1)}(a)}{(r-1)!}\,e^{at} + \frac{\phi^{(r-2)}(a)}{(r-2)!}\,\frac{t\,e^{at}}{1!} + \cdots + \frac{\phi'(a)}{1!}\,\frac{t^{r-2}e^{at}}{(r-2)!} + \phi(a)\,\frac{t^{r-1}e^{at}}{(r-1)!}$$

If we factor out $e^{at}$ from this expression, we have precisely the assertion of the
theorem.
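
The relation $A_{r-k} = \phi^{(k)}(a)/k!$ says that the partial-fraction coefficients for a repeated linear factor are just the first r Taylor coefficients of $\phi(s)$ about s = a. The following Python sketch illustrates this with exact rational arithmetic. It is only one possible way to organize the computation, not a method taken from the text: the helper names (`shift`, `taylor_quotient`, `repeated_factor_terms`) are hypothetical, polynomials are assumed to be given as coefficient lists in ascending powers, and $\phi = p/Q$ with Q(a) ≠ 0 is assumed.

```python
from fractions import Fraction
from math import comb

def shift(coeffs, a):
    """Ascending coefficients of p(u + a), given ascending coefficients of p(s)."""
    out = [Fraction(0)] * len(coeffs)
    for n, c in enumerate(coeffs):
        for j in range(n + 1):
            # binomial expansion of c * (u + a)^n
            out[j] += c * comb(n, j) * Fraction(a) ** (n - j)
    return out

def taylor_quotient(num, den, terms):
    """First `terms` Taylor coefficients of num(u)/den(u) about u = 0 (den[0] != 0)."""
    def get(seq, i):
        return seq[i] if i < len(seq) else Fraction(0)
    c = []
    for k in range(terms):
        # power-series division: num_k = sum_j den_j * c_{k-j}
        acc = get(num, k) - sum(get(den, j) * c[k - j] for j in range(1, k + 1))
        c.append(acc / den[0])
    return c

def repeated_factor_terms(p, Q, a, r):
    """Return [A_1, ..., A_r] for the factor (s - a)^r, where phi = p / Q."""
    c = taylor_quotient(shift(p, a), shift(Q, a), r)  # c[k] = phi^(k)(a)/k! = A_{r-k}
    return c[::-1]

# Illustrative data: phi(s) = s / (s^2 + 2s + 10), factor (s + 2)^2, so a = -2, r = 2.
A = repeated_factor_terms([0, 1], [10, 2, 1], -2, 2)
print(A)  # A_1 multiplies e^{-2t}, A_2 multiplies t e^{-2t}
```

With these illustrative inputs the coefficients are $A_1 = 3/50$ and $A_2 = -1/5$, that is, the terms $(3/50 - t/5)e^{-2t}$.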


THEOREM 3 

If $f(t) = \mathcal{L}^{-1}\{p(s)/q(s)\}$, where p(s) and q(s) are polynomials and the degree of
q(s) is greater than the degree of p(s), then the terms in f(t) which correspond
to an unrepeated, irreducible quadratic factor $(s + a)^2 + b^2$ of q(s) are

$$\frac{e^{-at}}{b}\,(\phi_i \cos bt + \phi_r \sin bt)$$

where $\phi_r$ and $\phi_i$ are, respectively, the real and imaginary parts of $\phi(-a + ib)$,
and $\phi(s)$ is the quotient of p(s) and all the factors of q(s) except the factor
$(s + a)^2 + b^2$.




PROOF From the familiar theory of partial fractions we recall that an unrepeated,
irreducible quadratic factor $(s + a)^2 + b^2$ of q(s) gives rise to a single
fraction of the form

$$\frac{As + B}{(s+a)^2 + b^2}$$

in the partial-fraction expansion of p(s)/q(s). If again we let h(s) denote the
fractions corresponding to all the other factors of q(s), we can, therefore, write

$$\frac{p(s)}{q(s)} = \frac{\phi(s)}{(s+a)^2 + b^2} = \frac{As + B}{(s+a)^2 + b^2} + h(s)$$

Multiplying this identity by $(s + a)^2 + b^2$, we obtain

$$\phi(s) = As + B + [(s+a)^2 + b^2]\,h(s)$$

Now put s = -a + ib. This value, of course, makes $(s + a)^2 + b^2$ vanish; hence
the last product drops out, leaving

$$\phi(-a + ib) = (-a + ib)A + B$$

or, reducing $\phi(-a + ib)$ to its standard complex form $\phi_r + i\phi_i$,

$$\phi_r + i\phi_i = (-aA + B) + ibA$$

Equating real and imaginary terms in the last identity, we find

$$\phi_r = -aA + B \qquad \phi_i = bA$$

or, solving for A and B,

$$A = \frac{\phi_i}{b} \qquad B = \frac{b\phi_r + a\phi_i}{b}$$

Thus the partial fraction which corresponds to the quadratic factor $(s + a)^2 + b^2$
is

$$\frac{As + B}{(s+a)^2 + b^2} = \frac{1}{b}\,\frac{\phi_i s + (b\phi_r + a\phi_i)}{(s+a)^2 + b^2} = \frac{1}{b}\left[\frac{\phi_i (s+a)}{(s+a)^2 + b^2} + \frac{b\,\phi_r}{(s+a)^2 + b^2}\right]$$

The inverse of this expression is evidently

$$\frac{1}{b}\,(\phi_i e^{-at}\cos bt + \phi_r e^{-at}\sin bt)$$

Factoring out $e^{-at}$ now gives the assertion of the theorem.

There is a fourth theorem dealing with repeated, irreducible 
quadratic factors, but, because of its complexity and limited 
usefulness, we shall not develop it here. Fortunately, many of 
the simpler transforms involving repeated quadratic factors can 
be handled by other means, for instance, the convolution theorem 
of Sec. 7.7. 


EXAMPLE 1 

If $\mathcal{L}\{f(t)\} = (s^2 + 2)/[s(s + 1)(s + 2)]$, what is f(t)?

The roots of the denominator are s = 0, -1, -2. Hence we must compute the values of

$$p(s) = s^2 + 2 \qquad\text{and}\qquad q'(s) = 3s^2 + 6s + 2$$




for these values of s. The results are

$$p(0) = 2 \qquad p(-1) = 3 \qquad p(-2) = 6$$
$$q'(0) = 2 \qquad q'(-1) = -1 \qquad q'(-2) = 2$$

From the corollary of Theorem 1 we now have at once

$$f(t) = \frac{2}{2}\,e^{0t} + \frac{3}{-1}\,e^{-t} + \frac{6}{2}\,e^{-2t} = 1 - 3e^{-t} + 3e^{-2t}$$

Equally well, of course, we could have obtained the coefficients in the inverse by suppressing
each of the factors in turn and evaluating the rest of the fraction at the root associated with the
suppressed factor.
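
The recipe of Corollary 1, evaluating p(a)/q'(a) at each simple root, is mechanical enough to check by machine. The short Python sketch below does this for the transform $(s^2 + 2)/[s(s + 1)(s + 2)]$ using exact fractions. The function names are hypothetical and polynomials are assumed to be stored as coefficient lists in descending powers; it is an illustration of the corollary, not a general-purpose inverse-transform routine.

```python
from fractions import Fraction

def poly_eval(coeffs, s):
    """Evaluate a polynomial (coefficients in descending powers) by Horner's rule."""
    result = Fraction(0)
    for c in coeffs:
        result = result * s + c
    return result

def poly_derivative(coeffs):
    """Coefficients (descending powers) of the derivative polynomial."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def heaviside_simple(p, q, roots):
    """Map each simple root a of q to the coefficient p(a)/q'(a) of e^{at}."""
    dq = poly_derivative(q)
    return {a: poly_eval(p, Fraction(a)) / poly_eval(dq, Fraction(a)) for a in roots}

p = [1, 0, 2]        # s^2 + 2
q = [1, 3, 2, 0]     # s(s + 1)(s + 2) = s^3 + 3s^2 + 2s
coeffs = heaviside_simple(p, q, [0, -1, -2])
print(coeffs)        # coefficient of e^{at} for each root a
```

The three coefficients come out as 1, -3, and 3, in agreement with $f(t) = 1 - 3e^{-t} + 3e^{-2t}$.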


EXAMPLE 2

If $\mathcal{L}\{y\} = s/[(s + 2)^2(s^2 + 2s + 10)]$, what is y?

Considering first the repeated linear factor, we identify

$$\phi(s) = \frac{s}{s^2 + 2s + 10} \qquad\text{and}\qquad \phi'(s) = \frac{10 - s^2}{(s^2 + 2s + 10)^2}$$

Evaluating these for the root s = -2, we obtain

$$\phi(-2) = -\frac{1}{5} \qquad\text{and}\qquad \phi'(-2) = \frac{3}{50}$$

Hence, by Theorem 2, the terms in y corresponding to $(s + 2)^2$ are

$$e^{-2t}\left(\frac{3}{50} - \frac{t}{5}\right) = \frac{(3 - 10t)\,e^{-2t}}{50}$$

For the quadratic factor $s^2 + 2s + 10 \equiv (s + 1)^2 + 3^2$, we have

$$\phi(s) = \frac{s}{(s+2)^2}$$

Hence,

$$\phi(-a + ib) = \phi(-1 + 3i) = \frac{-1 + 3i}{[(-1 + 3i) + 2]^2} = \frac{-1 + 3i}{(1 + 3i)^2} = \frac{-1 + 3i}{-8 + 6i} = \frac{13 - 9i}{50}$$

and thus $\phi_r = 13/50$, $\phi_i = -9/50$. The term in y corresponding to the factor
$s^2 + 2s + 10$ is, therefore,

$$\frac{e^{-t}}{3}\left[\frac{-9\cos 3t + 13\sin 3t}{50}\right]$$

Adding the two partial inverses, we have finally

$$y = \frac{(3 - 10t)\,e^{-2t}}{50} + \frac{e^{-t}(-9\cos 3t + 13\sin 3t)}{150}$$
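
Theorem 3 reduces the work for a quadratic factor $(s + a)^2 + b^2$ to a single complex evaluation $\phi(-a + ib)$, which is easy to check with ordinary complex arithmetic. The sketch below is an illustration only: `quadratic_factor_term` is a hypothetical helper, and $\phi$ is assumed to be supplied as a Python callable. It is applied to $\phi(s) = s/(s + 2)^2$ with a = 1, b = 3.

```python
import math

def quadratic_factor_term(phi, a, b):
    """Return (phi_r, phi_i, term) for the factor (s + a)^2 + b^2.

    term(t) evaluates (e^{-at}/b) * (phi_i cos bt + phi_r sin bt).
    """
    z = phi(complex(-a, b))          # phi evaluated at the root s = -a + ib
    phi_r, phi_i = z.real, z.imag

    def term(t):
        return math.exp(-a * t) / b * (phi_i * math.cos(b * t) + phi_r * math.sin(b * t))

    return phi_r, phi_i, term

phi = lambda s: s / (s + 2) ** 2     # quotient of p(s) and the remaining factor (s + 2)^2
phi_r, phi_i, term = quadratic_factor_term(phi, a=1.0, b=3.0)
print(phi_r, phi_i, term(0.0))
```

The computed real and imaginary parts agree with 13/50 and -9/50 to rounding error. At t = 0 the term equals -3/50, which cancels the value 3/50 of the repeated-factor terms $(3 - 10t)e^{-2t}/50$, so that y(0) = 0.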


EXERCISES 

Find the functions which have the following transforms: 


s 3 + 6s 2 + 11s + 6 


(s + 2) 2 (s 2 + 1) 


s 4 -f- 4s 3 + 4s 2 — 4s — 5 


(s + l)(s 2 + 4) 
s +1 

(s 2 + l)(s a + 4s + 13) 


(s + l)(s +2) 3 
s -f* 2 

S 4 _ 16s 2 + 100 




Solve the following differential equations: 

9 y"’ — 2 y" - y' + 2y = u(t — 2) 

10 y'" + 3 y" + 3 y' + y = cosh i 

11 y tv + 22 /'" + 2y" + 2 y' + y = 

12 


a" + 2z' + P y dt - t\ , 

jo a: 0 = —1, x a = 

4x" — 5x' + y — sin 2i i 


Vo - Vd = Vt =1 

2/o = 2/o = 2/o' = 0 

2/o = 2/'o = 2/o' = |/i" = 0 


1 


-1 


13 (D* + £> + 1)* + (jD — 1)2/ - «C0 1 

(D* + 2D + 3)x + (3D* + 4D - 3)j/ -■«(* - 1) j 

14 y' — 3z — 5 \ 

2 / + z' — w =3—2/1 j/o = 1, 2 o = 0, to 0 = 

z +io' - -1 j 

15 In the proof of Theorem 3, verify that, if the identity

$$\phi(s) = As + B + [(s + a)^2 + b^2]\,h(s)$$

is evaluated for s = -a - ib instead of for s = -a + ib, the same inverse is obtained.


7.6 

Transforms of periodic functions 


The application of the Laplace transformation to the important 
case of general periodic functions is based upon the following 
theorem: 


THEOREM 1 


If f(t) is a piecewise regular function of exponential order which is periodic with
period a, then

$$\mathcal{L}\{f(t)\} = \frac{\int_0^a f(t)\,e^{-st}\,dt}{1 - e^{-as}}$$


PROOF By definition,

$$\mathcal{L}\{f(t)\} = \int_0^\infty f(t)\,e^{-st}\,dt = \int_0^a f(t)\,e^{-st}\,dt + \int_a^{2a} f(t)\,e^{-st}\,dt + \int_{2a}^{3a} f(t)\,e^{-st}\,dt + \cdots$$

Now, in the second integral, let t = T + a; in the third integral let t = T + 2a;
and in general let t = T + na in the (n + 1)st integral. In each case dt = dT,
and the new limits become 0 and a. Hence

$$\mathcal{L}\{f(t)\} = \int_0^a f(T)\,e^{-sT}\,dT + \int_0^a f(T + a)\,e^{-s(T+a)}\,dT + \int_0^a f(T + 2a)\,e^{-s(T+2a)}\,dT + \cdots$$

$$= \int_0^a f(T)\,e^{-sT}\,dT + e^{-as}\int_0^a f(T + a)\,e^{-sT}\,dT + e^{-2as}\int_0^a f(T + 2a)\,e^{-sT}\,dT + \cdots$$

But $f(T + a) = f(T + 2a) = \cdots = f(T + na) = \cdots = f(T)$ for all values




of T, since, by hypothesis, f(t) is of period a. Thus we have

$$\mathcal{L}\{f(t)\} = \int_0^a f(T)\,e^{-sT}\,dT + e^{-as}\int_0^a f(T)\,e^{-sT}\,dT + e^{-2as}\int_0^a f(T)\,e^{-sT}\,dT + \cdots$$

$$= (1 + e^{-as} + e^{-2as} + \cdots)\int_0^a f(T)\,e^{-sT}\,dT$$

Now, if the infinite geometric progression which multiplies the integral is
explicitly summed, using the familiar formula S = 1/(1 - r), where the common
ratio r is $e^{-as}$, we obtain the result of the theorem.

EXAMPLE 1 

Find the transform of the rectangular wave shown in Fig. 7.12.


FIGURE 7.12 
An alternating 
rectangular 
wave. 


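The period here is 2b. Hence by Theorem 1,

$$\mathcal{L}\{f(t)\} = \frac{1}{1 - e^{-2bs}}\int_0^{2b} f(t)\,e^{-st}\,dt = \frac{1}{1 - e^{-2bs}}\left[\int_0^{b} 1\cdot e^{-st}\,dt + \int_b^{2b} (-1)\cdot e^{-st}\,dt\right]$$

$$= \frac{1}{s}\,\frac{1 - 2e^{-bs} + e^{-2bs}}{1 - e^{-2bs}} = \frac{(1 - e^{-bs})^2}{s(1 - e^{-bs})(1 + e^{-bs})}$$

$$= \frac{1 - e^{-bs}}{s(1 + e^{-bs})} = \frac{e^{bs/2} - e^{-bs/2}}{s(e^{bs/2} + e^{-bs/2})} = \frac{1}{s}\tanh\frac{bs}{2}$$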
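
Theorem 1 replaces an integral over an infinite range by an integral over a single period, which makes periodic transforms easy to check numerically. The following Python sketch evaluates the one-period integral with a composite Simpson rule and compares the result with $(1/s)\tanh(bs/2)$ for an alternating rectangular wave. The helper names and the sample values b = 1.5 and s = 0.7 are arbitrary illustrative choices, and the wave is assumed to be +1 on (0, b) and -1 on (b, 2b).

```python
import math

def simpson(g, lo, hi, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def periodic_transform(pieces, period, s):
    """Laplace transform of a periodic function from its definition over one period.

    pieces: list of (lo, hi, func) covering one period, func smooth on each piece.
    """
    one_period = sum(
        simpson(lambda t: f(t) * math.exp(-s * t), lo, hi) for lo, hi, f in pieces
    )
    return one_period / (1 - math.exp(-period * s))

b, s = 1.5, 0.7
square = [(0.0, b, lambda t: 1.0), (b, 2 * b, lambda t: -1.0)]
numeric = periodic_transform(square, 2 * b, s)
closed = math.tanh(b * s / 2) / s
print(numeric, closed)
```

Splitting the period at the discontinuity keeps each integrand smooth, so the quadrature converges quickly and the two values agree to many decimal places.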


EXAMPLE 2 

Find the transform of the saw-tooth wave shown in Fig. 7.13.
Here the period is k, and thus

$$\mathcal{L}\{f(t)\} = \frac{1}{1 - e^{-ks}}\int_0^k t\,e^{-st}\,dt = \frac{1}{1 - e^{-ks}}\left[\frac{e^{-st}}{s^2}\,(-st - 1)\right]_0^k$$

$$= \frac{1 - (1 + ks)\,e^{-ks}}{s^2(1 - e^{-ks})} = \frac{(1 + ks) - (1 + ks)\,e^{-ks} - ks}{s^2(1 - e^{-ks})}$$

$$= \frac{1 + ks}{s^2} - \frac{k}{s(1 - e^{-ks})}$$


FIGURE 7.13 
A saw-tooth 
wave. 




EXAMPLE 3 

What is the Laplace transform of the staircase function

$$f(t) = n + 1 \qquad nk < t < (n + 1)k$$

shown in Fig. 7.14a?


FIGURE 7.14 

The “staircase 
function” and its 
synthesis. 



(a) 

The required transform can easily be found by direct calculation. However, it is even simpler
to obtain it by considering f(t) to be the difference of the two functions shown in Fig. 7.14b.
The transform of the linear function (t + k)/k can be found at once by Formula 4, Sec. 7.3.
Except for the obvious coefficient 1/k, the transform of the saw-tooth function was obtained
in the last example. Hence,

$$\mathcal{L}\{f(t)\} = \frac{1}{k}\left(\frac{1}{s^2} + \frac{k}{s}\right) - \frac{1}{k}\left[\frac{1 + ks}{s^2} - \frac{k}{s(1 - e^{-ks})}\right] = \frac{1}{s(1 - e^{-ks})}$$


EXAMPLE 4 

If the Laplace transform of f(t) is $1/[(s + a)(1 - e^{-ks})]$, what is f(t)?

Although $\mathcal{L}\{f(t)\}$ resembles somewhat the transform of the staircase function obtained in
the last example, the correspondence is not sufficiently close to provide us with the required
inverse. Moreover, we cannot successfully employ the result of the last example after first using
the corollary of Theorem 5, Sec. 7.4, for if we replace s by s - a, the given transform becomes

$$\frac{1}{s[1 - e^{-k(s-a)}]} = \frac{1}{s[1 - e^{ak}e^{-ks}]}$$

and now, because of the factor $e^{ak}$, which is not equal to 1 except in the trivial cases a = 0 or
k = 0, we still do not have the transform of the staircase function. It appears, therefore, that
we must make a direct attack upon the problem. To do this, let us reverse the derivation of
Theorem 1 and replace $1/(1 - e^{-ks})$ by the infinite geometric series of which it is the sum:

$$\mathcal{L}\{f(t)\} = \frac{1}{(s + a)(1 - e^{-ks})} = \frac{1}{s + a}\,(1 + e^{-ks} + e^{-2ks} + e^{-3ks} + \cdots)$$

Now let us assume that we can take the inverse of this infinite series term by term. If we neglect
the exponential in, say, the (n + 1)st term, the inverse of what remains is obvious, namely,

$$e^{-at}$$

But, having neglected the exponential $e^{-nks}$, we must, according to Corollary 2 of Theorem 6,
Sec. 7.4, translate the function $e^{-at}$ to the right a distance of nk and then cut it off to the left of
t = nk. When this is done for each term, we have

$$f(t) = e^{-at} + e^{-a(t-k)}u(t - k) + e^{-a(t-2k)}u(t - 2k) + e^{-a(t-3k)}u(t - 3k) + \cdots$$

Taking into account the “cutoff” properties of the various translated step functions, it is thus clear that




$$f(t) = \begin{cases}
e^{-at} & \text{over the interval } (0, k) \\
e^{-at} + e^{ak}e^{-at} & \text{over the interval } (k, 2k) \\
e^{-at} + e^{ak}e^{-at} + e^{2ak}e^{-at} & \text{over the interval } (2k, 3k) \\
\cdots \\
e^{-at} + e^{ak}e^{-at} + e^{2ak}e^{-at} + \cdots + e^{nak}e^{-at} & \text{over the interval } [nk, (n + 1)k]
\end{cases}$$

In order to obtain a more convenient expression for f(t) over the general interval
nk < t < (n + 1)k, we can sum the finite geometric progression defining f(t) in this range.
Since this progression contains n + 1 terms and has the common ratio $r = e^{ak}$, it follows that
over this interval we have

$$f(t) = e^{-at}(1 + e^{ak} + \cdots + e^{nak}) = e^{-at}\,\frac{(e^{ak})^{n+1} - 1}{e^{ak} - 1}$$

$$= \frac{e^{-a[t - (n+1)k]}}{e^{ak} - 1} - \frac{e^{-at}}{e^{ak} - 1} \qquad nk < t < (n + 1)k$$

Now, to achieve a more symmetric form, let us define τ = t - (n + 1)k. Clearly, t = nk corresponds
to τ = -k and t = (n + 1)k corresponds to τ = 0, so that, for each value of n, the
parameter τ ranges from -k to 0 as t ranges from nk to (n + 1)k. If we make this substitution
in the first fraction only, f(t) assumes the form

$$f(t) = \frac{e^{-a\tau}}{e^{ak} - 1} - \frac{e^{-at}}{e^{ak} - 1} \qquad -k < \tau < 0,\quad nk < t < (n + 1)k$$






The second term is a continuous function, dying away rapidly as t increases if a > 0. The first
term is completely independent of n, that is, yields the same set of values over each interval,
because no matter what n may be, as t ranges from nk to (n + 1)k, τ always ranges from -k to 0.
Moreover, the first term is discontinuous, since at the left end of any interval, where τ = -k,
its value is

$$\frac{e^{ak}}{e^{ak} - 1}$$

while at the right end, where τ = 0, its value is

$$\frac{1}{e^{ak} - 1}$$

The periodic function it represents has, therefore, a jump of

$$\frac{e^{ak}}{e^{ak} - 1} - \frac{1}{e^{ak} - 1} = 1$$

at each of the points t = k, 2k, 3k, ....

In Fig. 7.15 the discontinuous periodic function represented by the first term in f(t), the
continuous transient term represented by the second fraction, and f(t) itself are shown for
a **. 14 and k = 2. 
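
The two descriptions of f(t) obtained in this example, the sum of translated and cut-off exponentials and the closed form in τ and t, can be compared numerically. The Python sketch below does so at a few sample points. The function names are hypothetical, and the parameter values a = 0.25, k = 2 are chosen only for illustration.

```python
import math

def f_series(t, a, k):
    """Sum of the translated exponentials e^{-a(t - jk)} u(t - jk) that are active at time t."""
    n = int(t // k)                      # t lies in the interval nk < t < (n + 1)k
    return sum(math.exp(-a * (t - j * k)) for j in range(n + 1))

def f_closed(t, a, k):
    """Closed form: periodic term in tau minus transient term in t."""
    n = int(t // k)
    tau = t - (n + 1) * k                # -k < tau < 0
    d = math.exp(a * k) - 1
    return math.exp(-a * tau) / d - math.exp(-a * t) / d

a, k = 0.25, 2.0
samples = [0.5, 1.9, 3.3, 5.3, 12.7]     # avoid the jump points t = k, 2k, ...
diffs = [abs(f_series(t, a, k) - f_closed(t, a, k)) for t in samples]
print(max(diffs))
```

The difference is at the level of rounding error at every sample point, as the geometric-progression argument predicts.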


EXAMPLE 5 

What is the solution of the equation $y' + 3y + 2\int_0^t y\,dt = f(t)$ if $y_0 = 1$ and if f(t) is the function
shown in Fig. 7.16?

FIGURE 7.16 A saw-tooth wave.

Taking the transform of each side of the given equation, using the result of Example 2 to
transform f(t), we have

$$[s\mathcal{L}\{y\} - 1] + 3\mathcal{L}\{y\} + \frac{2}{s}\,\mathcal{L}\{y\} = \frac{1 + s}{s^2} - \frac{1}{s(1 - e^{-s})}$$

$$\mathcal{L}\{y\} = \frac{s^2 + s + 1}{s(s + 1)(s + 2)} - \frac{1}{(s + 1)(s + 2)(1 - e^{-s})}$$

The inverse of the first fraction can be found immediately by the corollary of the first
Heaviside theorem:

$$\frac{1}{2} - e^{-t} + \frac{3}{2}\,e^{-2t}$$

To find the inverse of the second fraction we must write

$$\frac{1}{(s + 1)(s + 2)(1 - e^{-s})} = \left(\frac{1}{s + 1} - \frac{1}{s + 2}\right)\frac{1}{1 - e^{-s}} = \frac{1}{(s + 1)(1 - e^{-s})} - \frac{1}{(s + 2)(1 - e^{-s})}$$

and then use the results of Example 4. In this case k = 1, and thus the inverse over the general




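interval n < t < n + 1 is

$$\left[\frac{e^{-\tau}}{e - 1} - \frac{e^{-2\tau}}{e^2 - 1}\right] - \left[\frac{e^{-t}}{e - 1} - \frac{e^{-2t}}{e^2 - 1}\right] \qquad -1 < \tau < 0$$

The second term is obviously a continuous function of t and is simply an additional contribution
to the transient of the system. The periodic function defined by the first term is also continuous
in this case, because the unit jumps exhibited by each of the fractions at t = 1, 2, 3, ... are
of opposite sign and, hence, cancel each other. The entire solution for y is therefore

$$y = \underbrace{\left(\frac{1}{e - 1} - 1\right)e^{-t} + \left(\frac{3}{2} - \frac{1}{e^2 - 1}\right)e^{-2t}}_{\text{transient}} + \underbrace{\frac{1}{2} - \frac{e^{-\tau}}{e - 1} + \frac{e^{-2\tau}}{e^2 - 1}}_{\text{steady-state}}$$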


Figure 7.17 shows a plot of the component terms and of y itself. 

The analysis of equations like the one considered in Example 5
is so important that a table of additional results similar
to that obtained in Example 4 would be highly desirable. Using



FIGURE 7.17 
Plot showing the 
solution of 
Example 5. 






for the most part only the procedure illustrated in Example 4,
such a table can easily be developed, as we shall now show.

To eliminate unnecessary writing, it will be convenient to
introduce the functions defined in Table 7.1 for the interval
nk < x < (n + 1)k, where k is an arbitrary positive number,
n is an arbitrary nonnegative integer, and x is a variable which
is to be replaced by t or τ, as required. The functions $\phi_1(x,k)$ and
$\phi_2(x,k)$ are, respectively, the staircase function and the Morse
dot function. The functions $\phi_3(x,k)$ and $\phi_4(x,k)$ are the integrals
from 0 to x of $\phi_1(x,k)$ and $\phi_2(x,k)$, respectively. The function
$\phi_5(x,a,k)$ is precisely that which we encountered in the solution of
Example 4. The others, though somewhat more complicated,


table 7.1

Definition of            Definition of function over general interval
functional symbol        nk < x < (n + 1)k

$\phi_1(x,k)$            $n + 1$

$\phi_2(x,k)$            $\dfrac{(-1)^n + 1}{2}$

$\phi_3(x,k)$            $(n + 1)x - \dfrac{n(n + 1)k}{2}$

$\phi_4(x,k)$            $\dfrac{(-1)^n + 1}{2}\,x + \dfrac{k}{4}\,[1 - (-1)^n(2n + 1)]$

$\phi_5(x,a,k)$          $\dfrac{e^{-ax}}{e^{ak} - 1}$

$\phi_6(x,a,k)$          $\dfrac{e^{-ax}}{e^{ak} + 1}$

$\phi_7(x,a,b,k)$        $\dfrac{e^{-ax}\cos b(x + k) - e^{-a(x+k)}\cos bx}{2(\cosh ak - \cos bk)}$

$\phi_8(x,a,b,k)$        $\dfrac{e^{-ax}\cos b(x + k) + e^{-a(x+k)}\cos bx}{2(\cosh ak + \cos bk)}$

$\phi_9(x,a,b,k)$        $\dfrac{e^{-ax}\sin b(x + k) - e^{-a(x+k)}\sin bx}{2(\cosh ak - \cos bk)}$

$\phi_{10}(x,a,b,k)$     $\dfrac{e^{-ax}\sin b(x + k) + e^{-a(x+k)}\sin bx}{2(\cosh ak + \cos bk)}$

$\phi_{11}(x,a,k)$       $\dfrac{(x + k)\,e^{-ax} - x\,e^{-a(x+k)}}{2(\cosh ak - 1)}$

$\phi_{12}(x,a,k)$       $\dfrac{(x + k)\,e^{-ax} + x\,e^{-a(x+k)}}{2(\cosh ak + 1)}$




arise in the same way and can be plotted just as easily when the 
parameters a, b, and k are known. 

Table 7.2 lists the inverses of all elementary periodic-type 
transforms which are likely to be encountered. Of course, as 
Example 5 illustrated, it is usually necessary to employ the 
method of partial fractions before the results of Table 7.2 can 
be applied. 


table 7.2

      Laplace transform                                        Inverse over general interval
                                                               nk < t < (n + 1)k,  -k < τ < 0

 1.   $\dfrac{1}{s(1 - e^{-ks})}$                              $\phi_1(t,k)$

 2.   $\dfrac{1}{s(1 + e^{-ks})}$                              $\phi_2(t,k)$

 3.   $\dfrac{1}{s^2(1 - e^{-ks})}$                            $\phi_3(t,k)$

 4.   $\dfrac{1}{s^2(1 + e^{-ks})}$                            $\phi_4(t,k)$

 5.   $\dfrac{1}{(s + a)(1 - e^{-ks})}$, $a \neq 0$            $\phi_5(\tau,a,k) - \phi_5(t,a,k)$

 6.   $\dfrac{1}{(s + a)(1 + e^{-ks})}$, $a \neq 0$            $(-1)^n\phi_6(\tau,a,k) + \phi_6(t,a,k)$

 7.   $\dfrac{s + a}{[(s + a)^2 + b^2](1 - e^{-ks})}$          $\phi_7(\tau,a,b,k) - \phi_7(t,a,b,k)$ †

 8.   $\dfrac{s + a}{[(s + a)^2 + b^2](1 + e^{-ks})}$          $(-1)^n\phi_8(\tau,a,b,k) + \phi_8(t,a,b,k)$ ‡

 9.   $\dfrac{b}{[(s + a)^2 + b^2](1 - e^{-ks})}$              $\phi_9(\tau,a,b,k) - \phi_9(t,a,b,k)$ †

10.   $\dfrac{b}{[(s + a)^2 + b^2](1 + e^{-ks})}$              $(-1)^n\phi_{10}(\tau,a,b,k) + \phi_{10}(t,a,b,k)$ ‡

11.   $\dfrac{1}{(s + a)^2(1 - e^{-ks})}$, $a \neq 0$          $\phi_{11}(\tau,a,k) - \phi_{11}(t,a,k)$

12.   $\dfrac{1}{(s + a)^2(1 + e^{-ks})}$, $a \neq 0$          $(-1)^n\phi_{12}(\tau,a,k) + \phi_{12}(t,a,k)$

† The possibility that, simultaneously, a is zero and bk is an even multiple
of π is to be ruled out.

‡ The possibility that, simultaneously, a is zero and bk is an odd multiple
of π is to be ruled out.





Formulas 1 to 4 are obtained by obvious applications of
Theorem 1 and of Theorem 3, Sec. 7.2. Formula 5 was derived
in detail in Example 4, and the derivations of Formulas 6 to 10
follow almost exactly the same pattern. All that is necessary is
to express as complex exponentials the sines and cosines which
appear in the inverses of the individual terms. The expression
for f(t) over any interval nk < t < (n + 1)k is then, as in Example 4,
just a finite geometric progression which can be summed
and converted to a purely real form without difficulty.

The derivations of Formulas 11 and 12 are somewhat different
because of the repeated factors in the denominators of the transforms.
Over the general interval nk < t < (n + 1)k, these lead
to expressions for f(t) which are series of the form

$$\sum_{j=0}^{n} (t - jk)\,e^{-a(t - jk)} = t\,e^{-at}\sum_{j=0}^{n} (e^{ak})^j - k\,e^{-at}\sum_{j=0}^{n} j\,(e^{ak})^j$$

in the case of Formula 11, and

$$\sum_{j=0}^{n} (-1)^j (t - jk)\,e^{-a(t - jk)} = t\,e^{-at}\sum_{j=0}^{n} (-e^{ak})^j - k\,e^{-at}\sum_{j=0}^{n} j\,(-e^{ak})^j$$

in the case of Formula 12. In each instance, the second series is
not a geometric progression and must be summed by other
means. Fortunately, the results of Example 3, Sec. 4.5, are applicable,
and through their use the inverses given in Table 7.2 can
easily be established.
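
Before carrying out the summation analytically, the claimed result for Formula 11 can be spot-checked by comparing the finite series with the tabulated closed form. The Python sketch below assumes that $\phi_{11}(x,a,k) = [(x + k)e^{-ax} - xe^{-a(x+k)}]/[2(\cosh ak - 1)]$ and that the inverse is $\phi_{11}(\tau,a,k) - \phi_{11}(t,a,k)$ with τ = t - (n + 1)k. The function names and parameter values are illustrative only.

```python
import math

def phi11(x, a, k):
    """phi_11(x, a, k) as assumed above (a != 0)."""
    num = (x + k) * math.exp(-a * x) - x * math.exp(-a * (x + k))
    return num / (2 * (math.cosh(a * k) - 1))

def direct_sum(t, a, k):
    """Finite series sum_{j=0}^{n} (t - jk) e^{-a(t - jk)} on nk < t < (n + 1)k."""
    n = int(t // k)
    return sum((t - j * k) * math.exp(-a * (t - j * k)) for j in range(n + 1))

def formula_11(t, a, k):
    """Closed form: periodic (tau-evaluated) term minus transient (t-evaluated) term."""
    n = int(t // k)
    tau = t - (n + 1) * k
    return phi11(tau, a, k) - phi11(t, a, k)

a, k = 0.3, 2.0
errors = [abs(direct_sum(t, a, k) - formula_11(t, a, k)) for t in (0.7, 3.1, 7.4, 15.9)]
print(max(errors))
```

The same pattern, with the alternating signs included in the series, can be used to check Formula 12.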

The transient, or t-evaluated, terms in the inverses in Table
7.2 are all continuous for all t ≥ 0. This is true of the periodic,
or τ-evaluated, terms if and only if the degree of the polynomial
part of the denominator of the transform exceeds the degree
of the numerator by more than 1. If this is not the case, there is a
jump of 1 at each of the points t = k, 2k, 3k, ..., nk, ... if
the denominator of the transform contains $1 - e^{-ks}$ and a jump
of $(-1)^n$ if the denominator of the transform contains $1 + e^{-ks}$.


EXAMPLE 6 


A simple series circuit contains the elements R = 400, L = 0.2, $C = 10^{-6}$. At t = 0, while the
circuit is completely passive, an exponential “saw-tooth” voltage wave, equal to $E_0 e^{-5{,}000t}$
throughout one period and repeating itself every 0.002 sec, is switched into the circuit. Find the
total current and also the steady-state current which result.

The differential equation to be solved is

$$0.2\,\frac{di}{dt} + 400i + 10^6\int_0^t i\,dt = E(t)$$

Taking the Laplace transform of both sides, we obtain

$$\mathcal{L}\{i\}\left(0.2s + 400 + \frac{10^6}{s}\right) = E_0\,\frac{\int_0^{0.002} e^{-5{,}000t}e^{-st}\,dt}{1 - e^{-0.002s}} = \frac{E_0}{1 - e^{-0.002s}}\left[\frac{e^{-t(s + 5{,}000)}}{-(s + 5{,}000)}\right]_0^{0.002}$$




$$\mathcal{L}\{i\}\,\frac{s^2 + 2{,}000s + 5\times 10^6}{5s} = E_0\,\frac{1 - e^{-0.002s - 10}}{(s + 5{,}000)(1 - e^{-0.002s})} = E_0\,\frac{(1 - e^{-10}) + e^{-10}(1 - e^{-0.002s})}{(s + 5{,}000)(1 - e^{-0.002s})}$$

$$\mathcal{L}\{i\} = 5E_0e^{-10}\,\frac{s}{(s + 5{,}000)[(s + 1{,}000)^2 + (2{,}000)^2]} + 5E_0(1 - e^{-10})\,\frac{s}{(s + 5{,}000)[(s + 1{,}000)^2 + (2{,}000)^2](1 - e^{-0.002s})}$$

Now by simple partial-fraction manipulations we find

$$\frac{s}{(s + 5{,}000)[(s + 1{,}000)^2 + (2{,}000)^2]} = \frac{1}{4{,}000}\left[-\frac{1}{s + 5{,}000} + \frac{s + 1{,}000}{(s + 1{,}000)^2 + (2{,}000)^2}\right]$$

From this point the entire solution can be written down at once:

$$i = \frac{5E_0e^{-10}}{4{,}000}\,(-e^{-5{,}000t} + e^{-1{,}000t}\cos 2{,}000t)$$
$$\quad - \frac{5E_0(1 - e^{-10})}{4{,}000}\,[\phi_5(\tau, 5{,}000, 0.002) - \phi_5(t, 5{,}000, 0.002)]$$
$$\quad + \frac{5E_0(1 - e^{-10})}{4{,}000}\,[\phi_7(\tau, 1{,}000, 2{,}000, 0.002) - \phi_7(t, 1{,}000, 2{,}000, 0.002)]$$

The steady-state current is described by the terms in τ:

$$i_{ss} = -\frac{5E_0(1 - e^{-10})}{4{,}000}\,[\phi_5(\tau, 5{,}000, 0.002) - \phi_7(\tau, 1{,}000, 2{,}000, 0.002)]$$

or written out at length:

$$i_{ss} = -\frac{E_0(1 - e^{-10})}{800}\left[\frac{e^{-5{,}000\tau}}{e^{10} - 1} - \frac{e^{-1{,}000\tau}\cos 2{,}000(\tau + 0.002) - e^{-1{,}000(\tau + 0.002)}\cos 2{,}000\tau}{2(\cosh 2 - \cos 4)}\right]$$

This function, plotted for -0.002 < τ < 0, defines one complete cycle of the steady-state
current. Of course, the unit jumps in $\phi_5$ and $\phi_7$ at the ends of each period just cancel, leaving
the steady-state current continuous, as, of course, it must be.

The operational solution of a problem such as this, leading as it does to a relatively simple, 
finite expression for the response, is in general to be preferred to the use of Fourier series, which 
leaves the answer in the form of an infinite series. 


EXERCISES 


1 Using Theorem 1, verify that 


a  $\mathcal{L}\{\sin bt\} = b/(s^2 + b^2)$          b  $\mathcal{L}\{\cos bt\} = s/(s^2 + b^2)$

2 Obtain the Laplace transform of the staircase function (Fig. 7.14a) by direct evaluation 
of the Laplace transform integral. 

Find the Laplace transforms of the periodic functions whose definitions over one period are: 


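3  $f(t) = \sin t \qquad 0 < t < \pi$

4  $f(t) = \begin{cases} \sin t & 0 < t < \pi \\ 0 & \pi < t < 2\pi \end{cases}$

5  $f(t) = \begin{cases} t & 0 < t < a \\ 0 & a < t < 2a \end{cases}$

6  $f(t) = \begin{cases} 1 & 0 < t < a \\ 0 & a < t < 2a \\ -1 & 2a < t < 3a \\ 0 & 3a < t < 4a \end{cases}$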




Find the inverse of each of the following transforms: 


(s + l)(s +2)(** + l)(l -e" 2 *) 

9 I 

(s - l)(s + 2)2(1 + e~«) 

Solve the following differential equations and explain your answers, /(() being in each 
ease a periodic function defined over one period as indicated : 

11 y' + 4?/ + 3 J* ydt = /(f) /(f) = { _} ° J < J »o = 1 

12 y" + 4?/' + = /(f) /(f) = { J J < j < j » 8 " ** m 0 

is *“+,-/«> /«>-{; 

14 According to the footnotes to Table 7.2, certain values of a, b, and k cannot be allowed 
to occur simultaneously in Formulas 7, 8, 9, and 10 of Table 7.2 because the formulas 
become meaningless for these values. Why is this? 

15 Derive each of the following formulas of Table 7.2 

a Formula 6 b Formula 7 

d Formula 9 e Formula 10 

g Formula 12 

7.7 

Convolution and the Duhamel formulas 

We shall conclude this chapter by establishing a result concern- 
ing the product of transforms which is of considerable theoretical 
as well as practical interest. 

THEOREM 1 

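$$\mathcal{L}\{f(t)\}\,\mathcal{L}\{g(t)\} = \mathcal{L}\left\{\int_0^t f(t - \lambda)\,g(\lambda)\,d\lambda\right\} = \mathcal{L}\left\{\int_0^t f(\lambda)\,g(t - \lambda)\,d\lambda\right\}$$

PROOF Working with the term on the right in the first equality, we have
by definition

$$\mathcal{L}\left\{\int_0^t f(t - \lambda)\,g(\lambda)\,d\lambda\right\} = \int_0^\infty \left[\int_0^t f(t - \lambda)\,g(\lambda)\,d\lambda\right] e^{-st}\,dt \tag{1}$$

Now

$$u(t - \lambda) = \begin{cases} 1 & \lambda < t \\ 0 & \lambda > t \end{cases}$$

and thus

$$f(t - \lambda)\,g(\lambda)\,u(t - \lambda) = \begin{cases} f(t - \lambda)\,g(\lambda) & \lambda < t \\ 0 & \lambda > t \end{cases}$$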

c Formula 8 
f Formula 11 


s(s* + 2s +5)(1 +e~*) 

3s + 5 

(s +l)(s s +4s + 5)(1 - e" 3 *) 




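Since this product vanishes for all values of λ greater than t, the inner integration
in (1) can be extended to infinity if the factor u(t - λ) is inserted in the integrand.
Hence,

$$\mathcal{L}\left\{\int_0^t f(t - \lambda)\,g(\lambda)\,d\lambda\right\} = \int_0^\infty \left[\int_0^\infty f(t - \lambda)\,g(\lambda)\,u(t - \lambda)\,d\lambda\right] e^{-st}\,dt \tag{2}$$

Now our usual assumptions about the functions we transform are sufficient to
permit the order of integration in (2) to be interchanged:

$$\mathcal{L}\left\{\int_0^t f(t - \lambda)\,g(\lambda)\,d\lambda\right\} = \int_0^\infty \left[\int_0^\infty f(t - \lambda)\,g(\lambda)\,u(t - \lambda)\,e^{-st}\,dt\right] d\lambda \tag{3}$$

$$= \int_0^\infty g(\lambda)\left[\int_0^\infty f(t - \lambda)\,u(t - \lambda)\,e^{-st}\,dt\right] d\lambda$$

Because of the presence of u(t - λ), the integrand of the inner integral is identically
zero for all t < λ. Hence, the inner integration effectively starts not at
t = 0, but at t = λ. Therefore

$$\mathcal{L}\left\{\int_0^t f(t - \lambda)\,g(\lambda)\,d\lambda\right\} = \int_0^\infty g(\lambda)\left[\int_\lambda^\infty f(t - \lambda)\,e^{-st}\,dt\right] d\lambda \tag{4}$$

Now, in the inner integral on the right of (4), let t - λ = τ and dt = dτ.
Then

$$\mathcal{L}\left\{\int_0^t f(t - \lambda)\,g(\lambda)\,d\lambda\right\} = \int_0^\infty g(\lambda)\left[\int_0^\infty f(\tau)\,e^{-s(\tau + \lambda)}\,d\tau\right] d\lambda$$

$$= \int_0^\infty g(\lambda)\,e^{-s\lambda}\left[\int_0^\infty f(\tau)\,e^{-s\tau}\,d\tau\right] d\lambda$$

$$= \left[\int_0^\infty f(\tau)\,e^{-s\tau}\,d\tau\right]\left[\int_0^\infty g(\lambda)\,e^{-s\lambda}\,d\lambda\right]$$

$$= \mathcal{L}\{f(t)\}\,\mathcal{L}\{g(t)\} \qquad\text{as asserted.}$$

From symmetry, the second form of the theorem can be obtained by interchanging
f(t) and g(t).

The convolution, or Faltung,* integral

$$\int_0^t f(t - \lambda)\,g(\lambda)\,d\lambda$$

is frequently denoted simply by f(t)*g(t). In this symbolism
Theorem 1 becomes

$$\mathcal{L}\{f\}\,\mathcal{L}\{g\} = \mathcal{L}\{f * g\} = \mathcal{L}\{g * f\}$$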
EXAMPLE 1 

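If $\mathcal{L}\{f(t)\} = 1/(s^2 + 4s + 13)^2$, what is f(t)?

Clearly, we can write $\mathcal{L}\{f(t)\}$ in the form

$$\frac{1}{[(s + 2)^2 + 3^2]^2}$$

and then use the corollary of the first shifting theorem (Theorem 5, Sec. 7.4) to obtain

$$f(t) = \mathcal{L}^{-1}\left\{\frac{1}{[(s + 2)^2 + 3^2]^2}\right\} = e^{-2t}\,\mathcal{L}^{-1}\left\{\frac{1}{(s^2 + 3^2)^2}\right\} \tag{5}$$

Now

$$\frac{1}{(s^2 + 3^2)^2} = \frac{1}{s^2 + 3^2}\cdot\frac{1}{s^2 + 3^2} = \mathcal{L}\left\{\frac{\sin 3t}{3}\right\}\mathcal{L}\left\{\frac{\sin 3t}{3}\right\}$$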

* German for folding.




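Hence, by the convolution theorem,

$$\mathcal{L}^{-1}\left\{\frac{1}{(s^2 + 3^2)^2}\right\} = \frac{1}{9}\int_0^t \sin 3(t - \lambda)\,\sin 3\lambda\,d\lambda = \frac{1}{9}\int_0^t \frac{\cos(6\lambda - 3t) - \cos 3t}{2}\,d\lambda$$

$$= \frac{1}{18}\left[\frac{\sin(6\lambda - 3t)}{6} - \lambda\cos 3t\right]_0^t = \frac{1}{18}\left(\frac{\sin 3t}{3} - t\cos 3t\right)$$

Therefore, from (5),

$$f(t) = \frac{e^{-2t}(\sin 3t - 3t\cos 3t)}{54}$$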


This example illustrates how in certain cases the convolution theorem can be used in place 
of a fourth Heaviside theorem to handle repeated quadratic factors in the denominator of a 
transform. 
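
A convolution integral of this kind is also easy to evaluate numerically, which gives an independent check on the closed form $e^{-2t}(\sin 3t - 3t\cos 3t)/54$. The Python sketch below approximates $\int_0^t f(t - \lambda)g(\lambda)\,d\lambda$ by a composite Simpson rule with $f = g = \tfrac{1}{3}\sin 3t$. The function name `convolve` and the sample time t = 1.2 are illustrative choices, not part of the text.

```python
import math

def convolve(f, g, t, n=2000):
    """Approximate the convolution integral of f and g over (0, t) by Simpson's rule."""
    if t == 0:
        return 0.0
    h = t / n                                     # n must be even
    total = f(t) * g(0.0) + f(0.0) * g(t)         # endpoint values, lambda = 0 and lambda = t
    for i in range(1, n):
        lam = i * h
        total += (4 if i % 2 else 2) * f(t - lam) * g(lam)
    return total * h / 3

third_sin = lambda x: math.sin(3 * x) / 3         # inverse of 1/(s^2 + 3^2)

t = 1.2
numeric = math.exp(-2 * t) * convolve(third_sin, third_sin, t)
closed = math.exp(-2 * t) * (math.sin(3 * t) - 3 * t * math.cos(3 * t)) / 54
print(numeric, closed)
```

The factor $e^{-2t}$ is applied after the convolution, mirroring the use of the shifting theorem in (5).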


EXAMPLE 2 

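Find a particular integral of the differential equation

$$y'' + 2ay' + (a^2 + b^2)\,y = f(t)$$

Taking the Laplace transform of the given equation, assuming $y_0 = y_0' = 0$, since we desire
only a particular solution, we find

$$\mathcal{L}\{y\} = \frac{1}{(s + a)^2 + b^2}\,\mathcal{L}\{f(t)\}$$

Now

$$\frac{1}{(s + a)^2 + b^2} = \mathcal{L}\left\{\frac{1}{b}\,e^{-at}\sin bt\right\}$$

Hence

$$\mathcal{L}\{y\} = \mathcal{L}\{f(t)\}\,\mathcal{L}\left\{\frac{1}{b}\,e^{-at}\sin bt\right\}$$

and thus, by the convolution theorem,

$$y = \frac{1}{b}\int_0^t f(t - \lambda)\,e^{-a\lambda}\sin b\lambda\,d\lambda$$

or, equally well,

$$y = \frac{1}{b}\int_0^t f(\lambda)\,e^{-a(t - \lambda)}\sin b(t - \lambda)\,d\lambda = \frac{e^{-at}}{b}\int_0^t f(\lambda)\,e^{a\lambda}\sin b(t - \lambda)\,d\lambda$$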

It is interesting to compare this procedure with the method of variation of parameters 
(Sec. 2.4) for the determination of particular integrals of linear differential equations. The two 
give identical results in the case of constant-coefficient linear differential equations. 


An especially important application of the convolution 
theorem makes it possible to determine the response of a system 
to a general excitation if its response to a unit step function is 
known. To develop this idea we shall need the concepts of transfer 
function and indicial admittance. 

Any physical system capable of responding to an excitation 
can be thought of as a device by means of which an input function 
is transformed into an output function. If we assume that all 





initial conditions are zero at the moment when a single excitation,
or input, f(t) begins to act, then, by setting up the differential
equations describing the system, taking Laplace transforms,
and solving for the transform of the output y(t), we obtain a
relation of the form

$$\mathcal{L}\{y(t)\} = \frac{\mathcal{L}\{f(t)\}}{Z(s)} \tag{6}$$

where Z(s) is a function of s whose coefficients depend solely on 
the parameters of the system. Moreover, in the usual applications 
to linear systems, Z(s) will be just the quotient of two poly- 
nomials in s. 

In electrical problems where the input is an applied voltage
$E_0e^{j\omega t}$ and the output is the resultant current, the function Z(s),
except for the fact that the frequency variable jω is replaced by
the Laplace transform parameter s, is just the impedance of the
network. However, the importance of Z(s) is not restricted to
electrical circuits, and for systems of all sorts the function

$$\frac{1}{Z(s)} = \frac{\mathcal{L}\{y(t)\}}{\mathcal{L}\{f(t)\}} = \frac{\mathcal{L}\{\text{output}\}}{\mathcal{L}\{\text{input}\}}$$

is an exceedingly important quantity, usually called the transfer
function. In particular, after s has been replaced by jω, the
transfer function can be used to determine the effect of any system
on the phase and amplitude of a sinusoidal input of arbitrary
frequency, just as in the electrical case.

If a unit step function is applied to a system with transfer
function 1/Z(s), then from (6) we have

$$\mathcal{L}\{y(t)\} = \frac{\mathcal{L}\{u(t)\}}{Z(s)} = \frac{1}{sZ(s)}$$

The response in this particular case is called the indicial admittance
A(t); that is,

$$\mathcal{L}\{A(t)\} = \frac{1}{sZ(s)} \tag{7}$$

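Using (7) we can now rewrite (6) in the form

$$\mathcal{L}\{y(t)\} = s\,\mathcal{L}\{A(t)\}\,\mathcal{L}\{f(t)\}$$

Hence, by the convolution theorem,

$$\mathcal{L}\{y(t)\} = s\,\mathcal{L}\left\{\int_0^t A(t - \lambda)\,f(\lambda)\,d\lambda\right\} = s\,\mathcal{L}\left\{\int_0^t A(\lambda)\,f(t - \lambda)\,d\lambda\right\}$$

But from Theorem 3, Sec. 7.4, it follows that

$$y(t) = \frac{d}{dt}\left[\int_0^t A(t - \lambda)\,f(\lambda)\,d\lambda\right] = \frac{d}{dt}\left[\int_0^t A(\lambda)\,f(t - \lambda)\,d\lambda\right]$$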




FIGURE 7.18 Plot showing the synthesis of a general function by means of step functions.

Therefore, performing the indicated differentiations,* we have
equivalently

$$y(t) = \int_0^t A'(t - \lambda)\,f(\lambda)\,d\lambda + A(0)\,f(t) \tag{8}$$

and

$$y(t) = \int_0^t A(\lambda)\,f'(t - \lambda)\,d\lambda + A(t)\,f(0) \tag{9}$$

Since A(t) is by definition the response of a system which is
initially passive, it follows that A(0) = 0. Hence, Eq. (8) becomes
simply

$$y(t) = \int_0^t A'(t - \lambda)\,f(\lambda)\,d\lambda \tag{10}$$

Finally, by making the change of variable τ = t - λ in the
integrals in (9) and (10), we obtain the related expressions

$$y(t) = \int_0^t A'(\tau)\,f(t - \tau)\,d\tau \tag{11}$$

$$y(t) = A(t)\,f(0) + \int_0^t A(t - \tau)\,f'(\tau)\,d\tau \tag{12}$$

Formulas (9) to (12) all serve to express the response of a 
system to a general driving function /(f) in terms of the experi- 
mentally accessible response to a unit step function. They are 
often referred to collectively as Duhamel’s formulas, after the 
French mathematician J. M. C. Duhamel (1797-1872). 
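
As a concrete illustration of Duhamel's formulas, consider a hypothetical first-order system y' + y = f(t), for which Z(s) = s + 1 and hence $A(t) = 1 - e^{-t}$ and $A'(t) = e^{-t}$. This system and the ramp input f(t) = t are assumptions chosen for the example; they do not come from the text. The Python sketch below evaluates formulas (10) and (12) by Simpson's rule and compares both with the exact response $y = t - 1 + e^{-t}$.

```python
import math

def simpson(g, lo, hi, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

# Assumed system y' + y = f(t): indicial admittance and its derivative.
A = lambda t: 1 - math.exp(-t)
dA = lambda t: math.exp(-t)

# Assumed driving function f(t) = t and its derivative.
f = lambda t: t
df = lambda t: 1.0

def duhamel_10(t):
    """Formula (10): integral of A'(t - lam) f(lam) over 0 < lam < t."""
    return simpson(lambda lam: dA(t - lam) * f(lam), 0.0, t)

def duhamel_12(t):
    """Formula (12): A(t) f(0) plus integral of A(t - tau) f'(tau) over 0 < tau < t."""
    return A(t) * f(0.0) + simpson(lambda tau: A(t - tau) * df(tau), 0.0, t)

exact = lambda t: t - 1 + math.exp(-t)   # solution of y' + y = t with y(0) = 0
print(duhamel_10(2.0), duhamel_12(2.0), exact(2.0))
```

Both formulas reproduce the exact response, which reflects the fact that they are two forms of the same convolution.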

It is possible to interpret these integrals in physical terms 
as follows: Let the driving function /(f) be given, and imagine 
it approximated by a series of step functions, as shown in Fig. 
7.18. The first step function is of noninfinitesimal magnitude 



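* According to Leibnitz's rule, if $F(t) = \int_{a(t)}^{b(t)} \phi(x,t)\,dx$, where a and b are
differentiable functions of t and where $\phi(x,t)$ and $\partial\phi(x,t)/\partial t$ are continuous in
x and t, then

$$\frac{dF}{dt} = \int_{a(t)}^{b(t)} \frac{\partial\phi(x,t)}{\partial t}\,dx + \phi[b(t),t]\,\frac{db(t)}{dt} - \phi[a(t),t]\,\frac{da(t)}{dt}$$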




f(0). All later step functions in the approximation are of infinitesimal
magnitude, and their contributions in the limit will have
to be taken into account by integration. Specifically, since

$$\frac{\Delta f}{\Delta\lambda} \doteq \left.\frac{df}{dt}\right|_{t=\lambda} = f'(\lambda)$$

we have for the height $\Delta f_i$ of the general infinitesimal step function
the approximate expression

$$\Delta f_i = f'(\lambda_i)\,\Delta\lambda_i$$

Now if A(t) is the indicial admittance of the system, the first
step function f(0)u(t) produces a response equal to

$f(0)A(t)$

from the very definition of the indicial admittance as the response
per unit excitation. For the second step function $\Delta f_1\,u(t - \lambda_1)$,
there is a lag of $t = \lambda_1$ units of time before it begins to act. Hence
the infinitesimal response it produces is

$\Delta f_1\,A(t - \lambda_1)$   or   $f'(\lambda_1)\,\Delta\lambda_1\,A(t - \lambda_1)$

Similarly, the third step function produces the response

$f'(\lambda_2)\,\Delta\lambda_2\,A(t - \lambda_2)$

and in general the (i + 1)st step function produces the response

$f'(\lambda_i)\,\Delta\lambda_i\,A(t - \lambda_i)$

If these contributions to the total response are added, we obtain
for the response at a general time t

$y(t) = f(0)A(t) + f'(\lambda_1)\,\Delta\lambda_1\,A(t - \lambda_1) + f'(\lambda_2)\,\Delta\lambda_2\,A(t - \lambda_2) + \cdots + f'(\lambda_i)\,\Delta\lambda_i\,A(t - \lambda_i) + \cdots$

$\quad\;\; = f(0)A(t) + \sum f'(\lambda_i)\,A(t - \lambda_i)\,\Delta\lambda_i$

the summation extending over all the step functions which have
begun to act up to the instant t. In the limit when $\Delta\lambda_i$ approaches
zero and the height of each step function after the first, f(0)u(t),
approaches zero, the sum in the last expression becomes an
integral, and, except for the dummy variable, we have Eq. (12).
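
The limiting process just described can be imitated numerically. The Python sketch below adds the response to the first step, f(0)A(t), to the responses of n later steps of height f'(λ_i)Δλ_i and compares the total with a known solution. The system y' + y = f(t), with A(t) = 1 - e^{-t} and f(t) = cos t, is an assumed example, and the function names are hypothetical.

```python
import math

# Assumed example system y' + y = f(t), initially passive.
def A(t):
    return 1.0 - math.exp(-t)      # response to a unit step

def f(t):
    return math.cos(t)             # assumed driving function

def f_prime(t):
    return -math.sin(t)

def staircase_response(t, n):
    """f(0) A(t) plus the responses of n later steps of equal width t/n."""
    dlam = t / n
    total = f(0.0) * A(t)                      # first, noninfinitesimal step
    for i in range(1, n + 1):
        lam_i = i * dlam                       # instant at which step i begins
        total += f_prime(lam_i) * A(t - lam_i) * dlam
    return total

def exact(t):
    """Solution of y' + y = cos t with y(0) = 0."""
    return 0.5 * (math.cos(t) + math.sin(t) - math.exp(-t))
```

As n grows, `staircase_response(t, n)` should approach `exact(t)`, which is the numerical counterpart of the sum passing over into the integral of Eq. (12).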

To give a physical interpretation of Eq. (10), we must first 
determine the significance of the derivative of the indicial admit- 
tance, A'(t). To do this, we shall need the concept of a unit 
impulse. 

Suppose that we have the function shown in Fig. 7.19.
This consists of a suddenly applied excitation of constant
magnitude acting for a certain period of time and then suddenly
ceasing, the product of duration and magnitude being unity.
If a is very small, the period of application is correspondingly
small but the magnitude of the excitation is very great. It is
sometimes convenient to pursue this idea to the limit and imagine
a forcing function of arbitrarily large magnitude acting for
an infinitesimal time, the product of duration and intensity
remaining unity as $a \to 0$. The resulting “function” is usually
referred to as the unit impulse I(t) or the δ function δ(t).†

FIGURE 7.19 Plot suggesting the nature of a unit impulse (the pulse $[u(t) - u(t - a)]/a$).

In somewhat different terms, the δ function $\delta(t - t_0)$ is often
described by the following purported definition:

(13a)  $\delta(t - t_0) = \begin{cases} 0 & t \ne t_0 \\ \infty & t = t_0 \end{cases}$

(13b)  $\int_{-\infty}^{\infty} \delta(t - t_0)\,dt = 1$

Taken literally this is nonsense, for the area under a curve which
coincides with the t-axis at every point but one must surely be
zero and not unity, as (13b) asserts. However, if (13) is considered
to be merely suggestive of the limiting process by which we first
described the unit impulse, then, whatever its shortcomings as a
definition, it is at least as meaningful as certain other useful and
reasonably “respectable” concepts in applied mathematics.

Consider, for instance, the familiar concept of a concentrated
load on a beam (Fig. 7.20a). Clearly, such a load is physically
unrealizable and must be viewed as an idealization of the
following nature: Imagine that over the interval $(x_0, x_0 + a)$ the
beam bears a distributed load whose magnitude per unit length
is P/a (Fig. 7.20b). Then no matter how small a may be, the
total load on the beam, being equal to the product of the intensity
P/a and the interval length a, is just P. As $a \to 0$, the ideal
concept of a concentrated load thus emerges as the limiting form
of a realizable distributed load. If one were now asked to describe
the load per unit length w(x) in the limiting case, one would
probably give the following “definition”:

(14a)  $w(x) = \begin{cases} 0 & x \ne x_0 \\ \infty & x = x_0 \end{cases}$

(14b)  $\int_{-\infty}^{\infty} w(x)\,dx = P$

which corresponds in all essential respects to the description of
the δ function provided by (13).

† More specifically, δ(t) is often called the Dirac δ function, after the British
theoretical physicist P. A. M. Dirac (1902- ).

FIGURE 7.20 Plot suggesting the interpretation of a concentrated load on a beam as a unit impulse.

One interesting and important property of the δ function is
its ability to isolate or reproduce a particular value of a function
f(t) according to the following formula:

(15)   $\int_{-\infty}^{\infty} f(t)\,\delta(t - t_0)\,dt = f(t_0)$

To justify this we revert to the prelimiting approximation to the
δ function and use it in place of $\delta(t - t_0)$ in (15). This gives us
the approximating integral

$\int_{-\infty}^{\infty} f(t)\,\frac{u(t - t_0) - u(t - t_0 - a)}{a}\,dt = \frac{1}{a}\int_{t_0}^{t_0 + a} f(t)\,dt$

Now, by the law of the mean for integrals,* this integral is equal to

(16)   $\frac{1}{a}\,[a f(\xi)] = f(\xi) \qquad t_0 < \xi < t_0 + a$

Now as $a \to 0$, perforce $\xi \to t_0$, and so, from (16), the integral
approaches $f(t_0)$, as asserted.
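
The argument can be watched numerically: the average of f over the shrinking interval (t_0, t_0 + a) should settle on f(t_0). The Python sketch below is only an illustration; the test function exp, the point t_0 = 1, and the name `pulse_average` are assumptions.

```python
import math

def pulse_average(f, t0, a, n=1000):
    """(1/a) * integral of f over [t0, t0 + a], by the midpoint rule.

    This is the integral of f(t) against the prelimiting pulse
    [u(t - t0) - u(t - t0 - a)] / a.
    """
    h = a / n
    return sum(f(t0 + (k + 0.5) * h) for k in range(n)) * h / a

# Assumed example: f(t) = exp(t), t0 = 1; the values should approach e.
errors = [abs(pulse_average(math.exp, 1.0, a) - math.e)
          for a in (1e-1, 1e-2, 1e-3)]
```

Each entry of `errors` is smaller than the one before it, in keeping with the statement that ξ is squeezed toward t_0 as a shrinks.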

The unit impulse is only the first of an infinite sequence of
so-called singularity functions. As a direct generalization of the
unit impulse we have the unit doublet (Fig. 7.21), defined
(loosely) as

$\lim_{a \to 0} \frac{u(t) - 2u(t - a) + u(t - 2a)}{a^2}$

the unit triplet, defined similarly as

$\lim_{a \to 0} \frac{u(t) - 3u(t - a) + 3u(t - 2a) - u(t - 3a)}{a^3}$

and so on, indefinitely. Some of the properties of these “functions”
will be found among the exercises at the end of this section.

* This asserts that, if f(t) is continuous over the closed range of integration
$a \le t \le b$, then there exists at least one value of t, say $t = \xi$, between a and b
such that

$\int_a^b f(t)\,dt = (b - a) f(\xi)$

FIGURE 7.21 Plot suggesting the nature of a unit doublet.

It is interesting and important that in many applications
the use of the δ function can be rigorously justified by arguments
based on what is known as the Stieltjes integral,* a generalization
of the familiar Riemann integral. More generally, the singularity
functions are examples of mathematical objects known as
generalized functions, or distributions, which are studied in the
recently developed theory of distributions.†

* Named for the Dutch mathematician T. J. Stieltjes (1856-1894).
† An introductory account of the theory of distributions can be found in
Athanasios Papoulis, “The Fourier Integral and Its Applications,” pp. 269-282,
McGraw-Hill Book Company, New York, 1962.

FIGURE 7.22 Plot showing the synthesis of a general function by means of impulses.

To determine the Laplace transform of a unit impulse, we
return to the prelimiting approximation

$\frac{u(t) - u(t - a)}{a}$

shown in Fig. 7.19. Transforming this expression, we have, for
all a > 0,

$\frac{1}{a}\left(\frac{1}{s} - \frac{e^{-as}}{s}\right) = \frac{1 - e^{-as}}{as}$

As $a \to 0$, this transform assumes the indeterminate form 0/0,
but, evaluating it in the usual way by L’Hospital’s rule, we obtain
immediately the limiting value 1. In the same way we can show
that the transforms of the unit doublet and the unit triplet are,
respectively, $s$ and $s^2$, and the transforms of the other singularity
functions follow exactly the same pattern. Since these transforms
do not approach zero as s becomes infinite, we know from
Corollary 1 of Theorem 5, Sec. 7.1, that they are not the transforms
of piecewise regular functions of exponential order. This, of
course, is obvious, for although the singularity functions are all
of exponential order, they are limiting forms involving unbounded
behavior in the neighborhood of the origin and hence are not
piecewise regular.
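
The limits quoted above can be checked by direct evaluation. Transforming each shifted step in the prelimiting doublet and triplet in the same way as for the impulse gives $(1 - e^{-as})^2/(a^2 s)$ and $(1 - e^{-as})^3/(a^3 s)$; these two closed forms are worked out here for illustration and are not quoted from the text. The Python sketch below, with hypothetical function names, evaluates all three expressions for a small value of a.

```python
import math

def one_minus_exp(a, s):
    """1 - exp(-a s), computed with expm1 to limit rounding error for small a s."""
    return -math.expm1(-a * s)

def impulse_transform(a, s):
    """Transform of [u(t) - u(t - a)] / a."""
    return one_minus_exp(a, s) / (a * s)

def doublet_transform(a, s):
    """Transform of [u(t) - 2u(t - a) + u(t - 2a)] / a^2."""
    return one_minus_exp(a, s) ** 2 / (a * a * s)

def triplet_transform(a, s):
    """Transform of [u(t) - 3u(t - a) + 3u(t - 2a) - u(t - 3a)] / a^3."""
    return one_minus_exp(a, s) ** 3 / (a ** 3 * s)

# Assumed sample values: s = 2 and a small pulse width a.
s, a = 2.0, 1e-4
values = (impulse_transform(a, s), doublet_transform(a, s), triplet_transform(a, s))
```

For these sample values the three entries of `values` lie close to 1, s = 2, and s² = 4 respectively.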

We are now in a position to resume our attempt to give a
physical interpretation to Formula (10). For convenience let us
denote by h(t) the response of the system under discussion when
the driving function is a unit impulse. We have already seen
[Eq. (6)] that

$\mathcal{L}\{y(t)\} = \frac{\mathcal{L}\{f(t)\}}{Z(s)}$

Hence, if f(t) is a unit impulse, so that $\mathcal{L}\{f(t)\} = 1$ and y(t) = h(t),
we have

$\mathcal{L}\{h(t)\} = \frac{1}{Z(s)} = s\,\mathcal{L}\{A(t)\}$

the last step holding because A(t) is the response to a unit step,
whose transform is 1/s. Thus, from Theorem 3, Sec. 7.4, it follows that

$h(t) = \frac{dA(t)}{dt} = A'(t)$

or, in words, the response of a system to a unit impulse is the
derivative of the response of the system to a unit step function.

Now let f(t), in the general case, be approximated by a series
of infinitesimal impulses, as shown in Fig. 7.22. For the first
impulse, whose magnitude by definition is the product

$f(0)\,\Delta\lambda_0 \equiv f(\lambda_0)\,\Delta\lambda_0$

the infinitesimal response is $[f(\lambda_0)\,\Delta\lambda_0]\,A'(t)$, since $A'(t) \equiv h(t)$ is
the response per unit impulse. The second impulse does not occur
until $t = \lambda_1$; hence the response it produces is $[f(\lambda_1)\,\Delta\lambda_1]\,A'(t - \lambda_1)$,
and in general, the response produced by the (i + 1)st impulse is

$[f(\lambda_i)\,\Delta\lambda_i]\,A'(t - \lambda_i)$

If these contributions to the total response are added, we obtain
for the response at a general time t

$y(t) = \sum f(\lambda_i)\,A'(t - \lambda_i)\,\Delta\lambda_i$

the summation extending over all impulses which have acted on
the system up to the time t. In the limit when each $\Delta\lambda_i \to 0$, the
last sum becomes an integral, and we have Formula (10).

EXERCISES 

Find the inverse of each of the following transforms : 


.9 + 2 

s i + 4s + 4 


(s- + 9) 8 

s'* + 2s + 3 
_____ 

s 4 + 2,s 2 — s 

(.9 + 1)V+ lj* 


(** + 4s + IS) 2 

7 Using the convolution formula, find a particular integral of the equation

$y'' + 2ay' + a^2 y = f(t)$

8 Using the convolution formula, find a particular integral of the equation

$y'' + (a + b)y' + aby = f(t)$

9 Verify that the Laplace transform of the unit doublet is $s$ and that the Laplace transform
of the unit triplet is $s^2$.

10 If D(t) denotes the unit doublet function, show that

$\int_{-\infty}^{\infty} f(t)\,D(t - t_0)\,dt = -f'(t_0)$

11 Find A(t) and h(t) for the equation y'' + 3y' + 2y = 0, verify that h(t) = A'(t), and then
verify Formulas (10) and (12) when this equation is “driven” by the function $f(t) = e^{t}$.

12 a Find A(t) and h(t) for the system shown in Fig. 7.23 if the input is applied to $m_1$ and
if the output is the response, i.e., displacement, of $m_2$. Verify that h(t) = A'(t). What is the
response of $m_2$ to an arbitrary force f(t) applied to $m_1$ when the system is at rest in its
equilibrium position?

b Work part a if the input is applied to $m_2$ and the output is the response of $m_1$.

[Fig. 7.23: only the labels $k_1 = 1$ and $m_1 = 1$ are recoverable]




13 Find A(t) and h(t) for the system shown in Fig. 7.24 if the input is applied across the
indicated terminals and if the output is the current through $R_2$. Verify that h(t) = A'(t). What
is the current through $R_2$ due to an arbitrary voltage E(t) applied across the terminals
when all charges and currents in the system are zero?

[Fig. 7.24: only the labels $C_1 = 0.5 \times 10^{-6}$ and $C_2 = 10^{-6}$ are recoverable]

14 Show that the solution of the equation $ay'' + by' + cy = 0$ ($y_0 = 0$, $y'_0 = 1$) is exactly
the same as the solution of the equation $ay'' + by' + cy = a\,\delta(t)$ ($y_0 = y'_0 = 0$). Does this
fact have a physical interpretation? With what combination of singularity functions must
an initially passive, second-order equation be driven in order to have the same solution
as the undriven equation with initial conditions $y = y_0$, $y' = y'_0$?

15 Show that $f(t)*[g(t)*h(t)] = \int_0^t \int_0^{\lambda} f(t - \lambda)\,g(\lambda - \mu)\,h(\mu)\,d\mu\,d\lambda$.

16 Show that $f(t)*[g(t)*h(t)] = [f(t)*g(t)]*h(t)$ and that

$f(t)*[g(t) \pm h(t)] = [f(t)*g(t)] \pm [f(t)*h(t)]$

17 Show that $\mathcal{L}\{f(t)\}\,\mathcal{L}\{g(t)\}\,\mathcal{L}\{h(t)\} = \mathcal{L}\{f(t)*g(t)*h(t)\}$.

18 Show that $1*1 = t$ and that $1*1*1 = t^2/2$. What is the generalization of these results to
n factors?

19 Evaluate (a) $\delta(t - a)*f(t)$, (b) $u(t - a)*f(t)$, (c) $t^m * t^n$ if m and n are nonnegative integers.

20 If f(0) = g(0) = 0, show that $f'(t)*g(t) = f(t)*g'(t)$ and that

$[f(t)*g(t)]' = \frac{f'(t)*g(t) + f(t)*g'(t)}{2}$


CHAPTER EIGHT 


Partial Differential Equations

8.1 Introduction

In our previous work we have seen how the analysis of mechanical
and electrical systems containing lumped parameters often leads
to ordinary differential equations. However, assumptions to the 
effect that all masses exist as mass points, that all springs are 
weightless, or that the elements of an electrical circuit are concen- 
trated in ideal resistances, inductances, and capacitances rather 
than continuously distributed are frequently not sufficiently 
accurate. In such cases a more realistic approach usually leads to 
one or more partial differential equations which must be solved 
to obtain a description of the behavior of the system. In this 
chapter we shall discuss such equations as they commonly arise 
in engineering and in physics. We shall begin our study by exam- 
ining in detail the derivation from physical principles of certain 
typical partial differential equations. Then, knowing the forms 
of most frequent occurrence, we shall investigate methods of 
solution and their application to specific problems. 


8.2 The derivation of equations

One of the first problems to be attacked through the use of partial 
differential equations was that of the vibration of a stretched, 
flexible string. Today, after nearly 250 years, it is still an excellent 
initial example. 

Let us consider, then, an elastic string, stretched under a
tension T between two points on the x-axis (Fig. 8.1a). The
weight of the string per unit length after it is stretched we
suppose to be a known function w(x). Besides the elastic and inertia
forces inherent in the system, the string may also be acted upon
by a distributed load whose magnitude per unit length we assume
to be a known function of x, y, t, and the transverse velocity $\dot{y}$, say
$f(x,y,\dot{y},t)$. In formulating the problem we assume that

a The motion takes place entirely in one plane, and in this plane 
each particle moves at right angles to the equilibrium position 
of the string. 

b The deflection of the string during the motion is so small that 
the resulting change in length of the string has no effect on 
the tension T. 

c The string is perfectly flexible, i.e., can transmit force only 
in the direction of its length. 

d The slope of the deflection curve of the string is at all points 
and at all times so small that with satisfactory accuracy sin a 
can be replaced by tan a, where a is the inclination angle of 
the tangent to the deflection curve. 


Gravitational and frictional forces, if any, we suppose to be taken
into account in the expression for the load per unit length
$f(x,y,\dot{y},t)$.

FIGURE 8.1 (a), (b) A typical element of a vibrating string.


With these assumptions in mind, let us consider a general
infinitesimal segment of the string as a free body (Fig. 8.1b).
By assumption a, the mass of such an element is $\Delta m = w(x)\,\Delta x/g$.
By assumption b, the forces which act at the ends of the element
are the same, namely, T. By assumption c, these forces are
directed along the respective tangents to the deflection curve;
and, by assumption d, their transverse components are

$T \sin\alpha_2 \approx T \tan\alpha_2 = T \tan\alpha\,\big|_{x+\Delta x}$

and   $-T \sin\alpha_1 \approx -T \tan\alpha_1 = -T \tan\alpha\,\big|_{x}$

The acceleration produced in $\Delta m$ by these forces and by the
portion of the distributed load $f(x,y,\dot{y},t)\,\Delta x$ which acts over the
interval $\Delta x$ is approximately $\partial^2 y/\partial t^2$, where y is the ordinate of an
arbitrary point of the element. The time derivative is here
written as a partial derivative because obviously y depends not
only upon t but upon x as well. Applying Newton’s law to the
element, we can thus write

$\frac{w(x)\,\Delta x}{g}\,\frac{\partial^2 y}{\partial t^2} = T \tan\alpha\,\big|_{x+\Delta x} - T \tan\alpha\,\big|_{x} + f(x,y,\dot{y},t)\,\Delta x$

or, dividing by $\Delta x$,

(1)    $\frac{w(x)}{g}\,\frac{\partial^2 y}{\partial t^2} = T\,\frac{\tan\alpha\,|_{x+\Delta x} - \tan\alpha\,|_{x}}{\Delta x} + f(x,y,\dot{y},t)$

The fraction on the right-hand side consists of the difference
between $\tan\alpha$ at $x + \Delta x$ and at $x$, divided by the difference $\Delta x$.
In other words, it is precisely the difference quotient for the
function $\tan\alpha$. Hence its limit as $\Delta x \to 0$ is the derivative of
$\tan\alpha$ with respect to x, that is, $\partial(\tan\alpha)/\partial x$. But, since $\tan\alpha = \partial y/\partial x$,
this can be written simply as $\partial^2 y/\partial x^2$. Our final result, then, is that
the deflection y(x,t) of a stretched string satisfies the partial
differential equation*

(2)    $\frac{\partial^2 y}{\partial t^2} = \frac{Tg}{w(x)}\,\frac{\partial^2 y}{\partial x^2} + \frac{g}{w(x)}\,f(x,y,\dot{y},t)$

In most important applications the weight of the string per unit
length w(x) is a constant, and there are no external forces; i.e.,
$f(x,y,\dot{y},t)$ is identically zero. When this is the case, Eq. (2) reduces
to the one-dimensional wave equation

(3)    $\frac{\partial^2 y}{\partial t^2} = a^2\,\frac{\partial^2 y}{\partial x^2} \qquad a^2 = \frac{Tg}{w}$

The dimensions of $a^2$ are

$\frac{\text{Force} \times \text{acceleration}}{\text{Weight/unit length}} = \frac{(ML/T^2)(L/T^2)}{(ML/T^2)(1/L)} = \frac{L^2}{T^2}$

that is, a has the dimensions of velocity. The significance of this
will become apparent in Sec. 8.3 when we discuss the D’Alembert
solution of the wave equation.
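
To make the velocity interpretation of a concrete, the Python sketch below computes $a = \sqrt{Tg/w}$ for one set of sample values and checks by central differences that a profile moving with speed a, here y = sin(x - at), satisfies Eq. (3). The numbers (T = 10 lb, w = 0.01 lb/ft, g = 32.2 ft/sec²), the sinusoidal profile, and the function names are all assumptions made for illustration.

```python
import math

def wave_speed(T, w, g=32.2):
    """a = sqrt(T g / w); with T in lb, w in lb/ft, g in ft/sec^2, a is in ft/sec."""
    return math.sqrt(T * g / w)

def relative_residual(a, x, t, dx=1e-3):
    """(y_tt - a^2 y_xx) / a^2 for y = sin(x - a t), using central differences.

    The time step is taken as dx / a so that both second differences
    sample the profile at matching arguments.
    """
    def y(x_, t_):
        return math.sin(x_ - a * t_)

    dt = dx / a
    y_tt = (y(x, t + dt) - 2.0 * y(x, t) + y(x, t - dt)) / dt ** 2
    y_xx = (y(x + dx, t) - 2.0 * y(x, t) + y(x - dx, t)) / dx ** 2
    return (y_tt - a * a * y_xx) / (a * a)

a = wave_speed(T=10.0, w=0.01)          # assumed sample string
residual = relative_residual(a, x=0.3, t=0.01)
```

For these sample values a is about 179 ft/sec, and `residual` vanishes to within rounding error.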

* The question of what constitutes a satisfactory derivation of the partial
differential equation describing a given physical system is not a simple one.
To attempt to give a careful limiting argument is, in effect, “to strain at a
gnat and swallow a camel,” since, being ultimately atomic, no physical
system is continuous. Perhaps our purported derivations should be regarded
merely as plausibility arguments suggesting that certain partial differential
equations be accepted as the axioms of a theoretical or “rational” study of
applied mathematics, whose practical importance, in contrast to its purely
mathematical interest, is to be judged by how well its conclusions describe
past observations and predict new ones.

Closely related to the vibrating string is the vibrating
membrane. To obtain the partial differential equation describing its
behavior, we suppose that it is stretched across some closed
curve C in the x,y-plane and that when it vibrates each particle
moves in a direction perpendicular to the x,y-plane. We assume,
further, that the weight per unit area of the membrane after it is
stretched is a known function w(x,y) and that the tension per
unit length is the same at all points and in all directions. Finally,
we suppose that the membrane is acted upon by a known
distributed force whose magnitude per unit area is $f(x,y,z,\dot{z},t)$. Then
by computing the transverse, or z-components, of the tensile forces
acting across the boundaries of a typical two-dimensional element
of the membrane (Fig. 8.2) and applying Newton’s law to the
mass of such an element, we find without difficulty that the
deflection of the membrane z(x,y,t) satisfies the equation

(4)    $\frac{\partial^2 z}{\partial t^2} = \frac{Tg}{w(x,y)}\left(\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2}\right) + \frac{g}{w(x,y)}\,f(x,y,z,\dot{z},t)$

If the membrane is uniform and if there are no external
forces, i.e., if $f(x,y,z,\dot{z},t) = 0$, then Eq. (4) reduces to the
two-dimensional wave equation

(5)    $\frac{\partial^2 z}{\partial t^2} = a^2\left(\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2}\right) \qquad a^2 = \frac{Tg}{w}$

Here, as in the case of the vibrating string, the parameter a has
the dimensions of velocity.

FIGURE 8.2 A typical element of a vibrating membrane.

FIGURE 8.3 A typical element of a vibrating shaft.

As a third problem leading to a partial differential equation,
let us consider a shaft vibrating torsionally (Fig. 8.3a). The
material of the shaft we assume to have a modulus of elasticity
in shear $E_s$ and to be of uniform weight per unit volume $\rho$. The
cross-section area of the shaft at a distance x from one end we
suppose to be a known function, say A(x). The polar moment
of inertia J(x) of a general cross section about its center of gravity
we also suppose known. In addition to the obvious elastic and
inertia torques, the shaft may also be acted upon by a distributed
torque whose magnitude per unit length is a known function,
say $f(x,\theta,\dot{\theta},t)$, where $\theta$ is the angle through which a general cross
section has rotated from its equilibrium position and $\dot{\theta}$ is the
angular velocity with which that cross section rotates while the
shaft is vibrating. We assume further that

a All cross sections of the shaft remain plane during rotation.
b Each cross section rotates about its center of gravity.
c The shape of a general cross section does not depart greatly
from a circle.

Frictional torques, if any, we suppose to be taken into account
in the expression for the distributed torque per unit length,
$f(x,\theta,\dot{\theta},t)$.

We begin by considering as a free body an infinitesimal
segment of the shaft bounded by two cross sections a distance $\Delta x$
apart (Fig. 8.3b). The mass of such a disk is approximately

$\Delta m = \frac{\rho A(x)\,\Delta x}{g}$

and its radius of gyration is

$\sqrt{\frac{J(x)}{A(x)}}$

Hence, its polar moment of inertia is approximately

$\frac{J(x)}{A(x)}\,\Delta m = \frac{J(x)}{A(x)}\,\frac{\rho A(x)\,\Delta x}{g} = \frac{J(x)\rho\,\Delta x}{g}$

The rotation of such an element is produced by the portion
of the distributed torque $f(x,\theta,\dot{\theta},t)\,\Delta x$ which acts on it and by
the torque T, transmitted to it through the end sections by the
adjacent portions of the shaft. Therefore, applying Newton’s
law in torsional form, we have

$\frac{J(x)\rho\,\Delta x}{g}\,\frac{\partial^2 \theta}{\partial t^2} = T\,\big|_{x+\Delta x} - T\,\big|_{x} + f(x,\theta,\dot{\theta},t)\,\Delta x$

or, dividing by $\Delta x$ and then letting $\Delta x \to 0$,

(6)    $\frac{J(x)\rho}{g}\,\frac{\partial^2 \theta}{\partial t^2} = \frac{\partial T}{\partial x} + f(x,\theta,\dot{\theta},t)$

Now, from strength of materials, we recall that the torque
transmitted through any cross section of a twisted shaft is
proportional to the twist per unit length, i.e., the slope of the
$(\theta,x)$-curve at that cross section:

$T = k\,\frac{\partial \theta}{\partial x}$

The proportionality constant k is known as the torsional rigidity.
For shafts which are solids of revolution it can be shown that

$k = E_s J(x)$

and this result can be used with satisfactory accuracy whenever
the cross sections of a shaft are approximately circular. Hence,
in such cases Eq. (6) becomes

(7)    $\frac{J(x)\rho}{g}\,\frac{\partial^2 \theta}{\partial t^2} = \frac{\partial}{\partial x}\left[E_s J(x)\,\frac{\partial \theta}{\partial x}\right] + f(x,\theta,\dot{\theta},t)$

However, for configurations whose cross sections differ
appreciably from circles, it is necessary to determine the torsional rigidity
k by experimental means and continue the solution of Eq. (6)
by numerical rather than analytical methods.

In most elementary applications the shafts are of uniform
circular cross section and there are no external, distributed
torques. In such cases J(x) is a constant, $f(x,\theta,\dot{\theta},t)$ is identically zero,
and Eq. (7) therefore reduces to

(8)    $\frac{\partial^2 \theta}{\partial t^2} = a^2\,\frac{\partial^2 \theta}{\partial x^2} \qquad a^2 = \frac{E_s g}{\rho}$

which is again just the one-dimensional wave equation.

Another vibration problem of considerable practical interest
concerns the transverse vibrations of a beam. To obtain the partial
differential equation describing these vibrations, let us first choose
a coordinate system such that the beam in its undeflected position
coincides with a portion of the x-axis and the deflections occur in
the direction of the y-axis. A general cross section of the beam we
assume to be of known area A(x) and known moment of inertia
I(x) about its neutral axis. The material of the beam we suppose to
be of weight per unit volume $\rho$ and modulus of elasticity E. In
addition to the intrinsic elastic and inertia forces, the beam may
also be acted upon by a distributed load of known intensity
$f(x,y,\dot{y},t)$. Gravitational and frictional forces, if any, we suppose
included in this distributed load. Finally, we assume that all
particles of the beam move in a purely transverse direction, i.e.,
that the slight rotation of the cross sections as the beam vibrates
is negligible.

Now from the discussion in Sec. 2.6 we recall the following
formulas of beam flexure:

$EI(x)\,\frac{d^2 y}{dx^2} = M(x) \qquad \frac{dM(x)}{dx} = V(x) \qquad \frac{dV(x)}{dx} = -w(x)$

where M(x) = bending moment at a general cross section
V(x) = shear, or net transverse force, to the right of a general cross section
w(x) = load per unit length at a general cross section

Hence, combining these relations into a single equation, we have

(9)    $w(x) = -\frac{dV(x)}{dx} = -\frac{d^2 M(x)}{dx^2} = -\frac{\partial^2}{\partial x^2}\left[EI(x)\,\frac{\partial^2 y}{\partial x^2}\right]$

where the derivatives are now written as partial derivatives, since
in our problem y depends upon t as well as upon x.

During vibration the load per unit length on the beam consists
of two parts: the external load $f(x,y,\dot{y},t)$ and the inertia load due
to the motion of the beam itself. Now the mass of an infinitesimal
segment of the beam of length $\Delta x$ is approximately $[\rho A(x)\,\Delta x]/g$,
and the transverse acceleration of such a mass element is $\partial^2 y/\partial t^2$.
Hence the inertia load per unit length is†

$\frac{1}{\Delta x}\,\frac{\rho A(x)\,\Delta x}{g}\,\frac{\partial^2 y}{\partial t^2} = \frac{\rho}{g}\,A(x)\,\frac{\partial^2 y}{\partial t^2}$

and, therefore, the total load per unit length is

$w(x) = f(x,y,\dot{y},t) + \frac{\rho}{g}\,A(x)\,\frac{\partial^2 y}{\partial t^2}$

Substituting this into Eq. (9), we have finally

(10)   $\frac{\partial^2}{\partial x^2}\left[EI(x)\,\frac{\partial^2 y}{\partial x^2}\right] = -\frac{\rho}{g}\,A(x)\,\frac{\partial^2 y}{\partial t^2} - f(x,y,\dot{y},t)$

In many important applications the beam under consideration
is of constant cross section and there is no external load; that
is, A and I are constants and $f(x,y,\dot{y},t) \equiv 0$. Under these
conditions Eq. (10) reduces to the simpler form

(11)   $a^2\,\frac{\partial^4 y}{\partial x^4} = -\frac{\partial^2 y}{\partial t^2} \qquad a^2 = \frac{EIg}{A\rho}$

In this case the parameter a does not have the dimensions of
velocity.
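
A quick numerical check makes the last remark tangible. For the equation $a^2\,\partial^4 y/\partial x^4 = -\partial^2 y/\partial t^2$, a standing wave sin kx cos ωt is a solution only if ω = ak², so that a carries the dimensions of length squared per unit time rather than of velocity. The Python sketch below verifies this by finite differences; the trial solution, the sample values a = 2 and k = 1, and the function names are assumptions chosen for illustration.

```python
import math

def y(x, t, a, k):
    """Assumed trial solution: a standing wave with frequency a k^2."""
    return math.sin(k * x) * math.cos(a * k * k * t)

def beam_residual(x, t, a, k, hx=1e-2, ht=1e-3):
    """a^2 y_xxxx + y_tt by central differences; should be close to zero."""
    y_xxxx = (y(x + 2 * hx, t, a, k) - 4.0 * y(x + hx, t, a, k)
              + 6.0 * y(x, t, a, k)
              - 4.0 * y(x - hx, t, a, k) + y(x - 2 * hx, t, a, k)) / hx ** 4
    y_tt = (y(x, t + ht, a, k) - 2.0 * y(x, t, a, k)
            + y(x, t - ht, a, k)) / ht ** 2
    return a * a * y_xxxx + y_tt

residual = beam_residual(x=0.7, t=0.3, a=2.0, k=1.0)   # assumed sample point
scale = 2.0 ** 2 * abs(y(0.7, 0.3, 2.0, 1.0))          # size of each term alone
```

The residual is several orders of magnitude smaller than `scale`, the size of either term taken separately.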

† The sign of the inertia load per unit length can be checked by observing
that, when the beam is instantaneously concave toward the positive y-axis,
its elements are either losing velocity in the negative y-direction or gaining
velocity in the positive y-direction and, hence, have positive acceleration.
Therefore the inertia load per unit length is positive, as required by the
convention we established in Sec. 2.6 (Fig. 2.2). Similarly, when the beam
is instantaneously convex toward the positive y-axis, the acceleration of its
particles is negative, and so, too, is the inertia load per unit length.

An entirely different class of problems leading to partial
differential equations is encountered in the study of the flow of heat in
conducting regions. To obtain the equation governing this
phenomenon we must make use of the following experimental facts:

a Heat flows in the direction of decreasing temperature.
b The rate at which heat flows through an area is proportional to
the area and to the temperature gradient normal to the area.
c The quantity of heat gained or lost by a body when its
temperature changes is proportional to the mass of the body and to
the temperature change.




The proportionality constant in b is called the thermal con- 
ductivity of the material k. The proportionality constant in c is 
called the specific heat c. 

Let us now consider the thermal conditions in an infinitesimal
element of a conducting solid (Fig. 8.4). If the weight of the
conducting material per unit volume is $\rho$, the mass of such an element
is

$\Delta m = \frac{\rho\,\Delta x\,\Delta y\,\Delta z}{g}$

Then, if $\Delta u$ is the temperature change which occurs in the interval
$\Delta t$, the quantity of heat stored in the element in this time is, by c,

$\Delta H = c\,\Delta m\,\Delta u = \frac{c\rho\,\Delta x\,\Delta y\,\Delta z\,\Delta u}{g}$

and the rate at which heat is being stored is approximately

(12)   $\frac{\Delta H}{\Delta t} = \frac{c\rho}{g}\,\Delta x\,\Delta y\,\Delta z\,\frac{\Delta u}{\Delta t}$

The heat which produces the temperature change $\Delta u$ comes
from two sources. In the first place, heat may be generated
throughout the body, by electrical or chemical means for instance, at a
known rate per unit volume, say f(x,y,z,t). The rate at which heat
is being received by the element from this source is, then,

(13)   $f(x,y,z,t)\,\Delta x\,\Delta y\,\Delta z$

In the second place, the element may also gain heat by virtue of
heat transfer through its various faces.

In particular, the rate at which heat flows into the element
through the rear face EFGH is, by b, approximately

$-k\,\Delta y\,\Delta z\,\frac{\partial u}{\partial x}\Big|_{x,\;y+\frac12\Delta y,\;z+\frac12\Delta z}$

where, as an average figure, we have used the temperature
gradient $\partial u/\partial x$ at the mid-point of the face $(x,\; y + \tfrac12\Delta y,\; z + \tfrac12\Delta z)$.
The minus sign is necessary because the element gains
heat through the rear face if the normal temperature gradient,
i.e., the rate of change of temperature in the x-direction, is
negative. Similarly the element gains heat through the front face at the
approximate rate

$k\,\Delta y\,\Delta z\,\frac{\partial u}{\partial x}\Big|_{x+\Delta x,\;y+\frac12\Delta y,\;z+\frac12\Delta z}$

The sum of these two expressions is the net rate at which the
element is gaining heat because of heat flow in the x-direction.

In the same way we find that the rates at which the element
gains heat because of flow in the y- and z-directions are,
respectively,

$-k\,\Delta x\,\Delta z\,\frac{\partial u}{\partial y}\Big|_{x+\frac12\Delta x,\;y,\;z+\frac12\Delta z} + k\,\Delta x\,\Delta z\,\frac{\partial u}{\partial y}\Big|_{x+\frac12\Delta x,\;y+\Delta y,\;z+\frac12\Delta z}$

and   $-k\,\Delta x\,\Delta y\,\frac{\partial u}{\partial z}\Big|_{x+\frac12\Delta x,\;y+\frac12\Delta y,\;z} + k\,\Delta x\,\Delta y\,\frac{\partial u}{\partial z}\Big|_{x+\frac12\Delta x,\;y+\frac12\Delta y,\;z+\Delta z}$

FIGURE 8.4 A typical volume element in a region of three-dimensional heat flow.

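Now the rate at which heat is being stored in the element (12)
must equal the rate at which heat is being produced in the element
(13) plus the rate at which heat is flowing into the element from
the rest of the region. Hence we have the approximate relation

$\frac{c\rho}{g}\,\Delta x\,\Delta y\,\Delta z\,\frac{\Delta u}{\Delta t} = f(x,y,z,t)\,\Delta x\,\Delta y\,\Delta z$

$\qquad + k\,\Delta y\,\Delta z\left(\frac{\partial u}{\partial x}\Big|_{x+\Delta x,\;y+\frac12\Delta y,\;z+\frac12\Delta z} - \frac{\partial u}{\partial x}\Big|_{x,\;y+\frac12\Delta y,\;z+\frac12\Delta z}\right)$

$\qquad + k\,\Delta x\,\Delta z\left(\frac{\partial u}{\partial y}\Big|_{x+\frac12\Delta x,\;y+\Delta y,\;z+\frac12\Delta z} - \frac{\partial u}{\partial y}\Big|_{x+\frac12\Delta x,\;y,\;z+\frac12\Delta z}\right)$

$\qquad + k\,\Delta x\,\Delta y\left(\frac{\partial u}{\partial z}\Big|_{x+\frac12\Delta x,\;y+\frac12\Delta y,\;z+\Delta z} - \frac{\partial u}{\partial z}\Big|_{x+\frac12\Delta x,\;y+\frac12\Delta y,\;z}\right)$

Finally, dividing by $k\,\Delta x\,\Delta y\,\Delta z$ and letting $\Delta x$, $\Delta y$, $\Delta z$, and $\Delta t$
approach zero, we obtain the equation of heat conduction

(14)   $a^2\,\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} + \frac{1}{k}\,f(x,y,z,t) \qquad a^2 = \frac{c\rho}{kg}$

The parameter a in this equation does not have the dimensions of
velocity.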
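
The heat balance behind Eq. (14) translates almost word for word into a numerical scheme. The Python sketch below treats the one-dimensional case with no sources, $a^2\,\partial u/\partial t = \partial^2 u/\partial x^2$: in each time step the change in a cell's temperature is set by the difference between the flow in through one face and the flow out through the other. The rod of unit length, the end temperatures held at zero, the initial profile sin πx, the value a² = 1, the grid sizes, and the function name are all assumptions made for illustration.

```python
import math

def heat_1d(a2=1.0, nx=20, r=0.25, steps=160):
    """Explicit heat balance for a^2 u_t = u_xx on 0 <= x <= 1.

    r = dt / (a2 * dx^2) is the mesh ratio; the explicit scheme is
    stable only for r <= 1/2.
    """
    dx = 1.0 / nx
    dt = r * a2 * dx * dx
    u = [math.sin(math.pi * j * dx) for j in range(nx + 1)]
    for _ in range(steps):
        new = u[:]
        for j in range(1, nx):
            # net inflow through the two faces of cell j
            new[j] = u[j] + r * (u[j + 1] - 2.0 * u[j] + u[j - 1])
        new[0] = new[nx] = 0.0          # ends held at zero temperature
        u = new
    return u, steps * dt

u, t_end = heat_1d()
mid_numeric = u[10]                                   # value at x = 0.5
mid_exact = math.exp(-math.pi ** 2 * t_end) * 1.0     # exp(-pi^2 t / a^2) sin(pi/2)
```

With these settings the run ends at t = 0.1, and the computed mid-point temperature agrees with the exact decaying mode $e^{-\pi^2 t/a^2}\sin \pi x$ to about three decimal places.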

In many important cases, heat is neither generated nor lost
in the body, and we are interested only in the limiting,
steady-state temperature distribution when all change of temperature
with time has ceased. Under these conditions both f(x,y,z,t) and
$\partial u/\partial t$ are identically zero, and Eq. (14) becomes simply

(15)   $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0$

This exceedingly important equation, which arises in many
applications besides steady-state heat flow, is known as Laplace’s
equation, and is often written in the abbreviated form

(16)   $\nabla^2 u = 0$

As a final example of the derivation of partial differential
equations from physical principles, we consider the flow of
electricity in a long cable or transmission line. We assume the
cable to be imperfectly insulated so that there is both capacitance
and current leakage to ground (Fig. 8.5). Specifically, let

x = distance from sending end of cable
e(x,t) = potential at any point on cable at any time
i(x,t) = current at any point on cable at any time
R = resistance of cable per unit length
L = inductance of cable per unit length
G = conductance to ground per unit length of cable
C = capacitance to ground per unit length of cable

Now the potential at Q is equal to the potential at P minus
the drop in potential along the element PQ. Hence, referring to the
equivalent circuit shown in Fig. 8.5b,

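$e(x + \Delta x) = e(x) - (R\,\Delta x)\,i - (L\,\Delta x)\,\frac{\partial i}{\partial t}$

or   $e(x + \Delta x) - e(x) \equiv \Delta e = -(R\,\Delta x)\,i - (L\,\Delta x)\,\frac{\partial i}{\partial t}$

or finally, dividing by $\Delta x$ and then letting $\Delta x$ approach zero,

(17)   $\frac{\partial e}{\partial x} = -Ri - L\,\frac{\partial i}{\partial t}$

Likewise, the current at Q is equal to the current at P minus
the current lost through leakage to ground and the apparent
current loss due to the varying charge stored on the element. Hence,
referring again to Fig. 8.5,

$i(x + \Delta x) = i(x) - (G\,\Delta x)\,e - (C\,\Delta x)\,\frac{\partial e}{\partial t}$

or   $i(x + \Delta x) - i(x) \equiv \Delta i = -(G\,\Delta x)\,e - (C\,\Delta x)\,\frac{\partial e}{\partial t}$

or finally, dividing by $\Delta x$ and then letting $\Delta x$ approach zero,

(18)   $\frac{\partial i}{\partial x} = -Ge - C\,\frac{\partial e}{\partial t}$

FIGURE 8.5 A typical element of a transmission line: (a) the element PQ of length $\Delta x$; (b) its equivalent circuit, with series elements $R\,\Delta x$ and $L\,\Delta x$ and shunt elements $G\,\Delta x$ and $C\,\Delta x$.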


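If we differentiate Eq. (17) with respect to x and Eq. (18)
with respect to t, we obtain

$\frac{\partial^2 e}{\partial x^2} = -R\,\frac{\partial i}{\partial x} - L\,\frac{\partial^2 i}{\partial x\,\partial t}$   and   $\frac{\partial^2 i}{\partial t\,\partial x} = -G\,\frac{\partial e}{\partial t} - C\,\frac{\partial^2 e}{\partial t^2}$

If we eliminate the term $\partial^2 i/\partial x\,\partial t$ between these two
equations and replace $\partial i/\partial x$ by its value from Eq. (18), we find that e
satisfies the equation

(19)   $\frac{\partial^2 e}{\partial x^2} = LC\,\frac{\partial^2 e}{\partial t^2} + (RC + GL)\,\frac{\partial e}{\partial t} + RGe$

By differentiating Eq. (17) with respect to t and Eq. (18)
with respect to x and then eliminating the derivatives of e, we
obtain a similar equation for i:

(20)   $\frac{\partial^2 i}{\partial x^2} = LC\,\frac{\partial^2 i}{\partial t^2} + (RC + GL)\,\frac{\partial i}{\partial t} + RGi$

Equations (19) and (20) are known as the telephone equations.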
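
The elimination can be confirmed numerically with a trial solution. If e and i are taken as the real parts of $E\,e^{j\omega t - \gamma x}$ and $I\,e^{j\omega t - \gamma x}$, Eqs. (17) and (18) require $\gamma^2 = (R + j\omega L)(G + j\omega C)$ and $I = \gamma E/(R + j\omega L)$, and Eq. (19) should then hold identically. The Python sketch below checks this; the exponential trial solution, the sample line constants, and the function name are assumptions chosen for illustration and are not taken from the text.

```python
import cmath

def telephone_residuals(R, L, G, C, omega, E=1.0):
    """Residuals of Eqs. (17), (18), (19) for e, i proportional to exp(j w t - gamma x).

    With this trial form, d/dx acts as multiplication by -gamma and
    d/dt as multiplication by j*omega.
    """
    jw = 1j * omega
    gamma = cmath.sqrt((R + jw * L) * (G + jw * C))
    I = gamma * E / (R + jw * L)
    r17 = -gamma * E + (R + jw * L) * I          # e_x + R i + L i_t
    r18 = -gamma * I + (G + jw * C) * E          # i_x + G e + C e_t
    r19 = gamma ** 2 * E - (L * C * jw ** 2 + (R * C + G * L) * jw + R * G) * E
    return r17, r18, r19

# Assumed sample constants per unit length and an assumed angular frequency.
residuals = telephone_residuals(R=0.1, L=1e-3, G=1e-5, C=1e-8, omega=1e4)
```

All three residuals vanish to within rounding error, which confirms that (19) follows from (17) and (18) for this family of solutions.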

Two special cases of the telephone equations are worthy of 
note: 

a If leakage and inductance are negligible, that is, if G = L - 0, 
as they are, for example, for coaxial cables, Eqs. (19) and (20) 
reduce, respectively, to 

52 « _ T> r ^ 

W~ RC Jt 


These are known as the telegraph equations. Mathematically, 
they are identical with the one-dimensional heat equation, 
that is, the equation to which (14) reduces when there are no 
heat sources in the conducting region and the temperature 
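depends only on one space coordinate.

b At high frequencies the factor introduced by the time differentiation
is large. Hence the terms involving $e$ and $\partial e/\partial t$ or $i$ and
$\partial i/\partial t$ are insignificant in comparison with the terms containing the
corresponding second derivatives $\partial^2 e/\partial t^2$ and $\partial^2 i/\partial t^2$. In this case
Eqs. (19) and (20) reduce, respectively, to

$$\frac{\partial^2 e}{\partial x^2} = LC\,\frac{\partial^2 e}{\partial t^2} \tag{22a}$$

$$\frac{\partial^2 i}{\partial x^2} = LC\,\frac{\partial^2 i}{\partial t^2} \tag{22b}$$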




Each of these is an example of the one-dimensional wave equation
[Eq. (3)], $1/\sqrt{LC}$ having, in fact, the dimensions of velocity.
These equations are obtained at any frequency, of course, if
$R = G = 0$.
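
The two special cases can be spot-checked numerically. The sketch below is only an illustration under assumed values: the constants `R`, `L`, `C`, the wave number `k`, and the two trial functions are hypothetical choices, not taken from the text. It tests a decaying sinusoid against the telegraph equation for $e$ and a Gaussian pulse moving with velocity $1/\sqrt{LC}$ against the high-frequency equation for $e$, using central differences.

```python
import math

R, L, C = 3.0, 0.4, 0.25       # hypothetical line constants (illustrative only)
k = 1.5                        # assumed wave number of the trial telegraph solution
v = 1.0 / math.sqrt(L * C)     # velocity suggested by the wave-equation form

def e_telegraph(x, t):
    """Trial solution of e_xx = RC e_t: a sinusoid decaying in time."""
    return math.exp(-k * k * t / (R * C)) * math.sin(k * x)

def e_wave(x, t):
    """Trial solution of e_xx = LC e_tt: a pulse moving to the right with velocity v."""
    return math.exp(-(x - v * t) ** 2)

def first_diff_t(f, x, t, h=1e-4):
    """Central first difference in t."""
    return (f(x, t + h) - f(x, t - h)) / (2.0 * h)

def second_diff(f, x, t, wrt, h=1e-3):
    """Central second difference in 'x' or 't'."""
    if wrt == "x":
        return (f(x + h, t) - 2.0 * f(x, t) + f(x - h, t)) / (h * h)
    return (f(x, t + h) - 2.0 * f(x, t) + f(x, t - h)) / (h * h)

def telegraph_residual(x, t):
    return second_diff(e_telegraph, x, t, "x") - R * C * first_diff_t(e_telegraph, x, t)

def wave_residual(x, t):
    return second_diff(e_wave, x, t, "x") - L * C * second_diff(e_wave, x, t, "t")

print(telegraph_residual(0.4, 0.3), wave_residual(0.2, 0.1))
```

Both residuals are of the size of the finite-difference error, which is consistent with the heat-equation character of the first case and the wave-equation character of the second.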

It is interesting to note that nowhere in the derivation of any 
of the preceding equations was any use made of boundary condi- 
tions. In other words, the same partial differential equation is 
satisfied by a vibrating beam, for instance, whether the beam is 
built-in at one end and free at the other, built-in at both ends, or 
simply supported at both ends. Similarly, the flow of heat in a 
body is described by the same equation whether the surface is 
maintained at a constant temperature, insulated against heat loss, 
or allowed to cool freely by conduction to the surrounding 
medium. In general, as we shall soon see, the role of boundary 
conditions, for example, permanent conditions of constraint or of 
temperature, is to determine the form of those solutions of a partial 
differential equation which are relevant to a particular problem. 
Subsequent to this, the initial conditions of displacement, velocity, 
or temperature, say, determine specific values for the arbitrary 
constants appearing in these solutions. 

EXERCISES 

1 Supply the details of the derivation of Eq. (4) for the transverse vibrations of a membrane. 

2 What is the form of the heat equation if the thermal conductivity k and the specific heat c 
vary from point to point in the body? 

3 Consider the telephone equations in the so-called distortionless case when $RC = LG$, and
put $a^2 = RG$ and $v^2 = 1/LC$. Prove that if $e(x,t)$ [or, equally well, $i(x,t)$] is written in the
form $e(x,t) = \epsilon^{-avt}\,y(x,t)$, then the function $y$ satisfies the wave equation

$$v^2\,\frac{\partial^2 y}{\partial x^2} = \frac{\partial^2 y}{\partial t^2}$$

(Note: To avoid confusion with the voltage, $\epsilon$ is here used in place of $e$ to denote the base
of natural logarithms.)

4 Derive the partial differential equation satisfied by the concentration u of a liquid diffusing 
through a porous solid. (Hint: The rate at which liquid diffuses through an area is propor- 
tional to the area and to the concentration gradient normal to the area.) 

5 Consider a region of space filled with a moving fluid. Let the density of the fluid at the
point $(x,y,z)$ at time $t$ be $\rho(x,y,z,t)$, and let the particle instantaneously at the point $(x,y,z)$
have velocity components $v_x$, $v_y$, and $v_z$, respectively, in the directions of the coordinate
axes. By considering the flow through an infinitesimal region of dimensions $\Delta x$, $\Delta y$, $\Delta z$, show
that the velocity components satisfy the so-called equation of continuity:

$$\frac{\partial(\rho v_x)}{\partial x} + \frac{\partial(\rho v_y)}{\partial y} + \frac{\partial(\rho v_z)}{\partial z} + \frac{\partial \rho}{\partial t} = 0$$

6 If u(x,t) is the displacement of a general cross section of a bar which is vibrating longi- 
tudinally, show that 




where A (x) is the cross-sectional area of the shaft, E is the modulus of elasticity of the 
material of the shaft, and p is the weight per unit volume of the material. (Hint: Use the 
definition of the modulus of elasticity, 

$$E = \frac{\text{stress}}{\text{strain}} = \frac{\text{force/unit area}}{\text{stretch/unit length}}$$
to obtain the expression 


for the force transmitted through a general cross section of a stretched bar.) 

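7 a Show that Laplace’s equation in three dimensions

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0$$

is satisfied by the function

$$u = \frac{1}{\sqrt{(x - a)^2 + (y - b)^2 + (z - c)^2}}$$

for all values of the constants $a$, $b$, $c$.

b Determine whether Laplace’s equation in two dimensions

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$$

is satisfied by the function

$$u = \frac{1}{\sqrt{(x - a)^2 + (y - b)^2}}$$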

8 Show that Laplace’s equation in two dimensions is satisfied by the function

$$u = \ln\,[(x - a)^2 + (y - b)^2]$$

for all values of the constants $a$ and $b$.

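9 Show that, if $z_1(x,y)$ and $z_2(x,y)$ are solutions of the equation

$$p_1(x,y)\,\frac{\partial^2 z}{\partial x^2} + p_2(x,y)\,\frac{\partial^2 z}{\partial x\,\partial y} + p_3(x,y)\,\frac{\partial^2 z}{\partial y^2} + q_1(x,y)\,\frac{\partial z}{\partial x} + q_2(x,y)\,\frac{\partial z}{\partial y} + r_1(x,y)\,z = 0$$

then for all values of the constants $c_1$ and $c_2$ the expression $c_1 z_1(x,y) + c_2 z_2(x,y)$ is also a
solution.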

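10 Show that, when Laplace’s equation in cartesian coordinates

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0$$

is transformed into cylindrical coordinates by means of the substitutions $x = r\cos\theta$,
$y = r\sin\theta$, $z = z$, it becomes

$$\frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\,\frac{\partial u}{\partial r} + \frac{1}{r^2}\,\frac{\partial^2 u}{\partial \theta^2} + \frac{\partial^2 u}{\partial z^2} = 0$$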


8.3 

The D’Alembert solution of the wave equation

Each of the partial differential equations we encountered in the
last section can be solved by a method of considerable generality
known as separation of variables. For the one-dimensional wave
equation, however, there is also an elegant, special method
known as D’Alembert’s solution* which, because of the importance
of this equation, we shall examine in some detail before
developing more general techniques.

The whole matter is very simple. In fact, if $f$ is a function
possessing a second derivative, then

$$\frac{\partial f(x - at)}{\partial t} = -a f'(x - at) \qquad \frac{\partial f(x - at)}{\partial x} = f'(x - at)$$

$$\frac{\partial^2 f(x - at)}{\partial t^2} = a^2 f''(x - at) \qquad \frac{\partial^2 f(x - at)}{\partial x^2} = f''(x - at)$$

and from these results it is evident that $y = f(x - at)$ satisfies the
equation

$$\frac{\partial^2 y}{\partial t^2} = a^2\,\frac{\partial^2 y}{\partial x^2} \tag{1}$$

It is an equally simple matter to prove that, if $g$ is an arbitrary
twice-differentiable function, then $y = g(x + at)$ is likewise a
solution of (1). Hence, since (1) is a linear equation, it follows
that the sum

$$y = f(x - at) + g(x + at) \tag{2}$$

is also a solution. In fact, it can be shown (see Exercise 10) that
if $f$ and $g$ are arbitrary twice-differentiable functions, then (2) is a
complete solution of (1).

This form of the solution of the wave equation is especially
useful for revealing the significance of the parameter $a$ and its
dimensions of velocity. Suppose, specifically, that we consider the
vibrations of a uniform string† stretching from $-\infty$ to $\infty$. If its
transverse displacement is given by (2), we have in fact two waves
traveling in opposite directions along the string, each with velocity
$a$. For consider the function $f(x - at)$. At $t = 0$, it defines the
curve $y = f(x)$, and at any later time $t = t_1$, it defines the curve
$y = f(x - at_1)$. But these two curves are identical except that the
latter is translated to the right a distance equal to $at_1$. Thus the
entire configuration moves along the string without distortion a
distance of $at_1$ in $t_1$ units of time. The velocity with which the
wave is propagated is therefore

$$v = \frac{at_1}{t_1} = a$$


* Named for the French mathematician Jean le Rond D’Alembert (1717–1783).
The D’Alembert solution is actually not a special method but rather a
special application of a general procedure known as the method of characteris- 
tics. Unfortunately, this cannot be applied with comparable simplicity to 
problems involving the heat equation and Laplace’s equation, and so, despite 
its theoretical interest, we shall not discuss it here. An introduction to the 
theory can be found in Arnold Sommerfeld, “Partial Differential Equations 
in Physics,” pp. 36-43, Academic Press Inc., New York, 1949. 
† The use of the string as an illustration is purely a matter of convenience,
and any quantity satisfying the wave equation possesses the properties 
developed for the string. 




Similarly, the function $g(x + at)$ defines a configuration which
moves to the left along the string with constant velocity a. The 
total displacement of the string is, of course, the algebraic sum of 
these two traveling waves. 
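
The translation argument can be illustrated numerically. In the sketch below the hump $f(u) = 1/(1 + u^2)$, the velocity `a = 3.0`, the time `t1 = 2.0`, and the search grid are all assumptions chosen for illustration. The code locates the peak of $f(x - at)$ at two times and divides the distance moved by the elapsed time.

```python
def f(u):
    """Hypothetical wave shape: a single hump with its peak at u = 0."""
    return 1.0 / (1.0 + u * u)

a = 3.0   # assumed wave velocity

def peak_position(t, lo=-20.0, hi=20.0, n=4001):
    """Grid point at which the right-moving wave f(x - a t) is largest."""
    xs = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
    return max(xs, key=lambda x: f(x - a * t))

t1 = 2.0
speed = (peak_position(t1) - peak_position(0.0)) / t1
print(speed)   # agrees with a to within the grid spacing
```

Replacing `f(x - a * t)` by `f(x + a * t)` in `peak_position` gives the same speed with the opposite sign, which corresponds to the wave traveling to the left.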

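To carry the solution through in detail, let us suppose that
the initial displacement of the string at any point $x$ is given by
$\phi(x)$ and that the initial velocity of the string at any point $x$ is
$\theta(x)$. Then, as conditions to determine the form of $f$ and $g$, we
have, from (2) and its first derivative with respect to $t$,

$$y(x,0) = \phi(x) = f(x) + g(x) \tag{3}$$

$$\left.\frac{\partial y}{\partial t}\right|_{x,0} = \theta(x) = -a f'(x) + a g'(x) \tag{4}$$

Dividing Eq. (4) by $a$ and then integrating, we find

$$-f(x) + g(x) = \frac{1}{a}\int_{x_0}^{x} \theta(x)\,dx$$

Combining this with Eq. (3) and introducing the dummy variable
$s$ in the integrals, we obtain

$$f(x) = \frac{1}{2}\left[\phi(x) - \frac{1}{a}\int_{x_0}^{x} \theta(s)\,ds\right]$$

$$g(x) = \frac{1}{2}\left[\phi(x) + \frac{1}{a}\int_{x_0}^{x} \theta(s)\,ds\right]$$

With the forms of $f$ and $g$ known, we can now write

$$y = f(x - at) + g(x + at) = \frac{1}{2}\left[\phi(x - at) - \frac{1}{a}\int_{x_0}^{x - at} \theta(s)\,ds\right] + \frac{1}{2}\left[\phi(x + at) + \frac{1}{a}\int_{x_0}^{x + at} \theta(s)\,ds\right]$$

or, combining the integrals,

$$y(x,t) = \frac{\phi(x - at) + \phi(x + at)}{2} + \frac{1}{2a}\int_{x - at}^{x + at} \theta(s)\,ds \tag{5}$$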

EXAMPLE 1 

A string stretching to infinity in both directions is given the initial displacement

$$\phi(x) = \frac{1}{1 + 8x^2}\,†$$

and released from rest. Determine its subsequent motion.


† The initial deflection curve $y = \phi(x)$ clearly violates assumption d, Sec.
8.1, since at $x = -1/4$ (for instance), $\phi'(x) = \tan\alpha = 16/9 \approx 1.78$ while
$\sin\alpha = 0.87$. This difficulty can easily be overcome, however, by assuming
instead of $\phi(x)$ a new deflection curve

$$\Phi(x) = \frac{\phi(x)}{k}$$

where $k$ is a sufficiently large constant, say $k = 10{,}000$. Using $\phi(x)$ instead of
$\Phi(x)$ in this and in similar problems is just a convenient way of eliminating
the constant factor $1/k$ at each step of our work.




Since $\theta(x) = 0$, we have from (5) simply

$$y(x,t) = \frac{\phi(x - at) + \phi(x + at)}{2} = \frac{1}{2}\left[\frac{1}{1 + 8(x - at)^2} + \frac{1}{1 + 8(x + at)^2}\right]$$

The deflection of the string when $at$ = 0.0, 0.5, and 1.0 is shown in Fig. 8.6.
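
For readers who wish to tabulate such deflections, the D’Alembert formula (5) can be evaluated directly. The following Python sketch is one possible arrangement, not a prescribed method: the function names, the use of Simpson’s rule for the velocity integral, the choice $a = 1$, and the sample points are all assumptions made for illustration. The initial displacement is the one of Example 1.

```python
def simpson(func, lo, hi, n=200):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    if lo == hi:
        return 0.0
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3.0

def dalembert(phi, theta, a, x, t):
    """Evaluate Eq. (5) for initial displacement phi and initial velocity theta."""
    wave_part = 0.5 * (phi(x - a * t) + phi(x + a * t))
    velocity_part = simpson(theta, x - a * t, x + a * t) / (2.0 * a)
    return wave_part + velocity_part

def phi(x):
    """Initial displacement of Example 1."""
    return 1.0 / (1.0 + 8.0 * x * x)

def theta(x):
    """The string is released from rest."""
    return 0.0

a = 1.0   # assumed velocity; only the product a*t matters in this example
for at in (0.0, 0.5, 1.0):
    row = [round(dalembert(phi, theta, a, x, at / a), 4) for x in (-1.0, 0.0, 1.0)]
    print(at, row)
```

With a nonzero `theta` the same function covers the struck string, since the integral term then no longer vanishes.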




The motion of a semi-infinite string whose end is fixed is 
completely equivalent to the motion of one-half of a two-way 
infinite string having a fixed point, or node, located at some 
finite point, say the origin. To capitalize on this fact we need only 
imagine the actual string, stretching from 0 to $\infty$, to be extended
in the opposite direction to $-\infty$. The initial conditions of velocity
and displacement for the new portion of the string we define to be
equal in magnitude but opposite in sign to those given for the
actual string.* The solution for the resulting two-way infinite
string can be written down at once, using Eq. (5). In the nature of
the extended initial conditions, the displacement at the origin due 
to the wave traveling to the right from the left half of the string 


* This method of extending the initial conditions is sufficient but not neces- 
sary (see Exercise 6). 




will always be equal but opposite in sign to the displacement 
at the origin due to the wave traveling to the left from the right 
half of the string. Hence the string will always remain at rest at 
the origin, and the solution for the right half of the extended string 
will be precisely the solution of the original problem. 


EXAMPLE 2 

A semi-infinite string is given the displacement shown in Fig. 8.7a and released from rest.
Determine its subsequent motion.

FIGURE 8.7
A semi-infinite string and its conceptual extension: (a) the actual string; (b) the extended string.


We first imagine the string extended to $-\infty$ and released from rest in the extended initial
configuration shown in Fig. 8.7b. Since $\theta(x) \equiv 0$, we have, from (5),

$$y(x,t) = \frac{\phi(x - at) + \phi(x + at)}{2}$$

where $\phi(x)$ is the displacement function shown in Fig. 8.7b.* We thus have two displacement
waves, each of shape defined by $\tfrac{1}{2}\phi(x)$, one traveling to the right and one traveling to the left
along the string. Plots of these waves are shown in Fig. 8.8. An inspection of these configurations
reveals the important fact that a displacement wave is reflected from a fixed or “closed” end
without distortion but with reversal of sign.
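
The reflection can also be seen in a small computation. The sketch below assumes a hypothetical triangular pulse on $1 \le x \le 3$ (the shape in Fig. 8.7 is not reproduced here), extends it as an odd function to negative $x$, and evaluates (5) with zero initial velocity. The helper names and the value $a = 1$ are likewise assumptions.

```python
def odd_extension(phi_half):
    """Extend a displacement given for x >= 0 to all x with phi(-x) = -phi(x)."""
    def phi(x):
        return phi_half(x) if x >= 0 else -phi_half(-x)
    return phi

def phi_half(x):
    """Hypothetical initial pulse: a triangle of unit height centered at x = 2."""
    return max(0.0, 1.0 - abs(x - 2.0))

phi = odd_extension(phi_half)
a = 1.0   # assumed wave velocity

def y(x, t):
    """Eq. (5) with zero initial velocity, applied to the extended string."""
    return 0.5 * (phi(x - a * t) + phi(x + a * t))

print(y(0.0, 1.7))   # the origin stays at rest
print(y(2.0, 0.0))   # initial peak of the pulse
print(y(2.0, 4.0))   # after reflection the returning half-pulse is inverted
print(y(6.0, 4.0))   # the half-pulse that went to the right is unchanged
```

At time $t = 4/a$ the half of the pulse that traveled toward the fixed end has returned to $x = 2$ with its sign reversed, while the other half continues to the right with its original sign.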

The motion of a finite string can be obtained as the motion of
a segment of an infinite string with suitably defined initial
displacement and velocity. If the string is given between 0 and $l$, say,
we first imagine that it is extended from 0 to $-l$ with initial conditions
which are equal but opposite in sign to those for the actual
string. Then we extend the string to infinity in each direction
subject to initial conditions which duplicate with period $2l$ the
initial configuration between $-l$ and $l$.


EXAMPLE 3 

A string of length l is given the displacement shown in Fig. 8.9 and released from rest. Determine 
its subsequent motion. 


* If, as suggested by Fig. 8.7a, the graph of $\phi(x)$ has one or more corner
points, then, strictly speaking, $\phi(x)$ does not describe an admissible initial
displacement function. In fact, in the derivation of Eq. (5) both $f(x)$ and
$g(x)$ were assumed to be twice differentiable, and, therefore, $\phi(x)$ must also
be twice differentiable, which is not the case if there are points where the
derivative of $\phi(x)$ is undefined. The apparent solutions obtained from Eq. (5)
by overlooking this fact are, therefore, at best only formal solutions, and
are to be viewed with suspicion unless and until it is verified directly that
they satisfy the given partial differential equation and its accompanying
boundary and initial conditions. Questions concerning the existence and
uniqueness of solutions of partial differential equations are quite difficult,
and in our work we shall be concerned mainly with techniques for obtaining
formal solutions. For an extended discussion of the problem of establishing
the validity of solutions derived by purely formal means see, for instance,
R. V. Churchill, “Fourier Series and Boundary Value Problems,” 2d ed., pp.
126-163, McGraw-Hill Book Company, New York, 1963.




FIGURE 8.8
Plot showing the propagation of a disturbance along a semi-infinite string.



The necessary extension of the string and one half cycle of its motion are shown in Fig. 8.10.
An inspection of Fig. 8.10 shows that the period of the motion, i.e., the least time for its return
to its initial state, is just the time for either of the traveling waves to traverse a distance $2l$. In
other words, since the velocity of the waves is $a$, the period is $2l/a$. The frequency of the
vibrations is therefore $a/2l$. We shall encounter this formula again when we solve the wave equation
by the method of separation of variables.
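
The period $2l/a$ can be confirmed with a short computation. In the sketch below the length `l`, the velocity `a`, and the plucked shape `phi_string` are hypothetical (the shape of Fig. 8.9 is not reproduced). The initial displacement is extended as an odd function with period $2l$, and (5) is evaluated with zero initial velocity.

```python
def periodic_odd_extension(phi_string, l):
    """Extend a displacement given on 0 <= x <= l to an odd function of period 2l."""
    def phi(x):
        x = (x + l) % (2.0 * l) - l          # reduce x to the interval [-l, l)
        return phi_string(x) if x >= 0 else -phi_string(-x)
    return phi

l, a = 2.0, 1.5   # hypothetical string length and wave velocity

def phi_string(x):
    """Hypothetical plucked shape: zero at both ends, unit peak at x = l/4."""
    peak = l / 4.0
    return x / peak if x <= peak else (l - x) / (l - peak)

phi = periodic_odd_extension(phi_string, l)

def y(x, t):
    """Eq. (5) with zero initial velocity, applied to the extended string."""
    return 0.5 * (phi(x - a * t) + phi(x + a * t))

period = 2.0 * l / a
print(y(0.7, 0.3), y(0.7, 0.3 + period))   # the displacement repeats after one period
print(y(0.0, 1.1), y(l, 1.1))              # both ends remain at rest
```

The reciprocal of `period` is the frequency $a/2l$ quoted above.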

FIGURE 8.9
A finite string with initial displacement.



EXERCISES 

1 A uniform string stretching from $-\infty$ to $\infty$ is given the initial displacement

$$y(x,0) = \begin{cases} 1 - |x| & x^2 < 1 \\ 0 & x^2 \ge 1 \end{cases}$$

and released from rest. Find the displacement of the string as a function of $x$ and $t$, and plot
the displacement curves for $at = \tfrac{1}{4}, \tfrac{1}{2}, \tfrac{3}{4}, 1$. What is the transverse velocity of the string
at $x = 0$?

2 Criticize the following argument “proving” that a string displaced as in Exercise 1 and
released from rest will remain motionless: “At $t = 0$ the displacement curve of the string
consists exclusively of segments of straight lines, and at all points of any line, or segment
of a line, $y = ax + b$, it is obvious that $d^2y/dx^2 = 0$. Hence, at $t = 0$ we have $\partial^2 y/\partial x^2 = 0$, and,
therefore, from the wave equation

$$\frac{\partial^2 y}{\partial t^2} = a^2\,\frac{\partial^2 y}{\partial x^2}$$

it follows that, when $t = 0$, the acceleration $\partial^2 y/\partial t^2$ is zero at all points of the string. But if a
particle has zero velocity and if there is no acceleration, i.e., if there is no change in velocity,
the velocity remains zero. Therefore the string will never move.”(!)

3 A uniform string stretching from $-\infty$ to $\infty$ is given the initial displacement

$$y(x,0) = \begin{cases} \cos x & x^2 < \dfrac{\pi^2}{4} \\ 0 & x^2 \ge \dfrac{\pi^2}{4} \end{cases}$$

and released from rest. Find the displacement of the string as a function of $x$ and $t$, and
plot the displacement curves for $at = \dfrac{\pi}{4}, \dfrac{\pi}{2}, \dfrac{3\pi}{4}$.

4 A uniform string stretching from $-\infty$ to $\infty$, while at rest in its equilibrium position, is
struck in such a way that the portion of the string between $x = -1$ and $x = 1$ is given a
velocity of 1. Find the displacement as a function of $x$ and $t$, and plot the displacement
curves for $at = 1$ and 2.

5 A uniform string stretching from 0 to $\infty$ is initially displaced into the curve $y = xe^{-x}$ and
released from rest. Find its displacement as a function of $x$ and $t$.

6 A uniform string stretching from 0 to $\infty$ begins its motion with initial displacement $\phi(x)$
and initial velocity $\theta(x)$. Show that its motion can be found as the motion of the right half
of a two-way infinite string, provided merely that the initial displacement $\phi(-x)$ and the
initial velocity $\theta(-x)$ for the negative extension of the string satisfy the condition

$$\phi(x) + \phi(-x) = -\frac{1}{a}\int_{-x}^{x} \theta(s)\,ds$$

7 If a semi-infinite string begins its motion with initial displacement $\phi(x) = (\sin x)/a$ and
initial velocity $\theta(x) = 1$ and if the negative extension of the string is imagined to have the
initial displacement $\phi(-x) = 0$, find the necessary initial velocity for the extended portion
of the string.


FIGURE 8.10

Plot showing one half cycle of the motion of a finite string.




8 The initial displacement of a two-way infinite string is 
y(x, 0) = 7- — — - x 2 < 00 


With what velocity must the string start to move in order that its subsequent motion will 
consist solely of a wave traveling to the right? 

9 The initial velocity of a two-way infinite string is

$$\left.\frac{\partial y}{\partial t}\right|_{x,0} = \begin{cases} \sin x & x^2 < \pi^2 \\ 0 & x^2 \ge \pi^2 \end{cases}$$

From what initial displacement must the string start to move in order that its subsequent
motion will consist solely of a wave traveling to the right?

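10 Show that, under the substitutions $u = x - at$ and $v = x + at$, the equation

$$\frac{\partial^2 y}{\partial t^2} = a^2\,\frac{\partial^2 y}{\partial x^2}$$

becomes

$$\frac{\partial^2 y}{\partial u\,\partial v} = 0$$

Hence show that $y = f(x - at) + g(x + at)$ is a complete solution
of the one-dimensional wave equation.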

11 Discuss the possibility of finding solutions of the form $z = f(\lambda x + y)$ for the equation

$$A\,\frac{\partial^2 z}{\partial x^2} + 2B\,\frac{\partial^2 z}{\partial x\,\partial y} + C\,\frac{\partial^2 z}{\partial y^2} = 0 \qquad A, B, C \text{ constants}$$

and show that, according as $B^2 - AC$ is greater than, equal to, or less than zero, there will
be two, one, or no (real) values of $\lambda$ for which such solutions exist. (The given equation is
said to be hyperbolic, parabolic, or elliptic in the respective cases, and the nature of its
solutions and their properties is significantly different in each case.)

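12 The equation

$$A(x,y)\,\frac{\partial^2 z}{\partial x^2} + 2B(x,y)\,\frac{\partial^2 z}{\partial x\,\partial y} + C(x,y)\,\frac{\partial^2 z}{\partial y^2} = f\!\left(x, y, z, \frac{\partial z}{\partial x}, \frac{\partial z}{\partial y}\right)$$

is said to be hyperbolic, parabolic, or elliptic at a point $(x,y)$ according as $B^2(x,y) - A(x,y)C(x,y)$
is, respectively, greater than, equal to, or less than zero. For what values
of $x$ and $y$ is the equation

$$(1 - y)\,\frac{\partial^2 z}{\partial x^2} + 2(1 - x)\,\frac{\partial^2 z}{\partial x\,\partial y} + (1 + y)\,\frac{\partial^2 z}{\partial y^2} = 0$$

hyperbolic? parabolic? elliptic?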
13 Let

$$A(x,y)\,\frac{\partial^2 z}{\partial x^2} + 2B(x,y)\,\frac{\partial^2 z}{\partial x\,\partial y} + C(x,y)\,\frac{\partial^2 z}{\partial y^2} = f\!\left(x, y, z, \frac{\partial z}{\partial x}, \frac{\partial z}{\partial y}\right)$$

and let $\phi(x,y) = c_1$ and $\psi(x,y) = c_2$ be the functions whose derivatives satisfy the equation

$$A(x,y)(w')^2 - 2B(x,y)\,w' + C(x,y) = 0$$

These functions are called the characteristics of the given partial differential equation,
and, if $B^2(x,y) - A(x,y)C(x,y) \ge 0$, they define families of (real) curves which are called
characteristic curves.

a What are the characteristic curves of the one-dimensional wave equation?

b If the given partial differential equation is hyperbolic, show that the change of independent
variables defined by the substitutions $u = \phi(x,y)$, $v = \psi(x,y)$ will reduce it to the
standard form

$$\frac{\partial^2 z}{\partial u\,\partial v} = F\!\left(z, \frac{\partial z}{\partial u}, \frac{\partial z}{\partial v}, u, v\right)$$

c If the given partial differential equation is parabolic, show that the change of independent
variables defined by the substitutions $u = x$, $v = \phi(x,y)$ will reduce it to the
standard form

$$\frac{\partial^2 z}{\partial v^2} = F\!\left(z, \frac{\partial z}{\partial u}, \frac{\partial z}{\partial v}, u, v\right)$$

d If the given partial differential equation is elliptic, show that the change of independent
variables defined by the substitutions $u + iv = \phi(x,y)$, $u - iv = \psi(x,y)$ will reduce it to
the standard form

$$\frac{\partial^2 z}{\partial u^2} + \frac{\partial^2 z}{\partial v^2} = F\!\left(z, \frac{\partial z}{\partial u}, \frac{\partial z}{\partial v}, u, v\right)$$


14 Using the substitutions described in the preceding exercise, solve each of the following
equations:

a $z_{xx} + 3z_{xy} + 2z_{yy} = 0$
b $z_{xx} + 4z_{xy} + 4z_{yy} = 0$
c $z_{xx} + 4z_{xy} + 5z_{yy} = 0$
d $xz_{xy} + yz_{yy} = 0$
e $xz_{xy} - yz_{yy} = z_y$
f $z_{xx} + 2(x + y)z_{xy} + 4xyz_{yy} = 0$

15 a Discuss the possibility of extending the D’Alembert solution to the two-dimensional
wave equation $a^2(z_{xx} + z_{yy}) = z_{tt}$.

b Discuss the possibility of finding solutions of the form $e^{\lambda x + \mu y}$ for the equation

$$Az_{xx} + Bz_{xy} + Cz_{yy} + Dz_x + Ez_y + Fz = 0 \qquad A, B, C, D, E, F \text{ constants}$$


8.4

Separation of variables

We are now ready to consider the solution of partial differential 
equations by the method of separation of variables. Although 
this method is not universally applicable, it suffices for most of 
the partial differential equations encountered in elementary appli- 
cations in engineering and in physics and leads directly to the 
heart of the branch of mathematics which deals with boundary
value problems.

The idea behind the method is the familiar mathematical 
stratagem of reducing a new problem to dependence upon an 
old one. In this case we attempt to convert the given partial 
differential equation into several ordinary differential equations, 
hopeful that what we know about the latter will prove adequate 
for a successful continuation. 

To illustrate the details of the procedure, let us again consider
the wave equation, this time taking the torsionally vibrating
shaft of finite length as a specific representation:

$$\frac{\partial^2 \theta}{\partial t^2} = a^2\,\frac{\partial^2 \theta}{\partial x^2}$$

We assume, as a working hypothesis, that solutions for the angle
of twist $\theta$ exist, as products of a function of $x$ alone and a function
of $t$ alone:

$$\theta(x,t) = X(x)T(t)$$




If this is the case, then partial differentiation of $\theta$ amounts to
total differentiation of one or the other of the factors of $\theta$, and
we have

$$\frac{\partial^2 \theta}{\partial x^2} = X''T \qquad \text{and} \qquad \frac{\partial^2 \theta}{\partial t^2} = XT''$$

Substituting these into the wave equation, we obtain

$$XT'' = a^2 X''T$$

Dividing by $XT$ then gives

$$\frac{T''}{T} = a^2\,\frac{X''}{X} \tag{1}$$

as a necessary condition that $\theta(x,t) = X(x)T(t)$ should be a
solution.

Now the left member of (1) is clearly independent of $x$.
Hence (in spite of its appearance) the right-hand side of (1) must
also be independent of $x$, since it is identically equal to the expression
on the left. Similarly, each member of (1) must be independent
of $t$. Therefore, being independent of both $x$ and $t$, each side
of (1) must be a constant, say $\mu$, and we can write

$$\frac{T''}{T} = a^2\,\frac{X''}{X} = \mu$$


Thus the determination of solutions of the original partial differential
equation has been reduced to the determination of solutions
of the two ordinary differential equations

$$T'' = \mu T \qquad \text{and} \qquad X'' = \frac{\mu}{a^2}\,X$$

Assuming that we need consider only real values of $\mu$, there
are three cases to investigate:

$$\mu > 0 \qquad \mu = 0 \qquad \mu < 0$$

If $\mu > 0$, we can write $\mu = \lambda^2$. In this case the two differential
equations and their solutions are

$$T'' = \lambda^2 T \qquad X'' = \frac{\lambda^2}{a^2}\,X$$

$$T = Ae^{\lambda t} + Be^{-\lambda t} \qquad X = Ce^{\lambda x/a} + De^{-\lambda x/a}$$

But a solution of the form

$$\theta(x,t) = X(x)T(t) = (Ce^{\lambda x/a} + De^{-\lambda x/a})(Ae^{\lambda t} + Be^{-\lambda t})$$

cannot describe the undamped vibrations of a system because it
is not periodic, i.e., does not repeat itself periodically as time
increases. Hence, although product solutions of the differential
equation exist for $\mu > 0$, they have no significance in relation
to the problem we are considering.

If $\mu = 0$, the equations and their solutions are

$$T'' = 0 \qquad X'' = 0$$

$$T = At + B \qquad X = Cx + D$$




But, again, a solution of the form

$$\theta(x,t) = X(x)T(t) = (Cx + D)(At + B)$$

cannot describe a periodic motion. Hence, the alternative $\mu = 0$
must be rejected.

Finally, if $\mu < 0$ we can write $\mu = -\lambda^2$. Then the component
differential equations and their solutions are

$$T'' = -\lambda^2 T \qquad X'' = -\frac{\lambda^2}{a^2}\,X$$

$$T = A\cos\lambda t + B\sin\lambda t \qquad X = C\cos\frac{\lambda}{a}x + D\sin\frac{\lambda}{a}x$$

In this case the solution

$$\theta(x,t) = X(x)T(t) = \left(C\cos\frac{\lambda}{a}x + D\sin\frac{\lambda}{a}x\right)(A\cos\lambda t + B\sin\lambda t) \tag{2}$$

is clearly periodic, repeating itself identically every time $t$ increases
by $2\pi/\lambda$. In other words, $\theta(x,t)$ represents a vibratory
motion with period $2\pi/\lambda$ or frequency $\lambda/2\pi$.
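
The contrast between the cases $\mu > 0$ and $\mu < 0$ can be made concrete. The sketch below uses arbitrary assumed values for $a$, $\lambda$, and the constants $A$, $B$, $C$, $D$. It checks by central differences that both product forms satisfy the wave equation, and then compares each with itself one interval $2\pi/\lambda$ later.

```python
import math

a, lam = 2.0, 3.0                      # assumed wave velocity and separation parameter
A, B, C, D = 0.7, -0.4, 1.1, 0.5       # arbitrary constants, for illustration only

def theta_periodic(x, t):
    """Product solution (2), arising from mu = -lam**2."""
    X = C * math.cos(lam * x / a) + D * math.sin(lam * x / a)
    T = A * math.cos(lam * t) + B * math.sin(lam * t)
    return X * T

def theta_exponential(x, t):
    """Product solution arising from mu = +lam**2."""
    X = C * math.exp(lam * x / a) + D * math.exp(-lam * x / a)
    T = A * math.exp(lam * t) + B * math.exp(-lam * t)
    return X * T

def wave_residual(theta, x, t, h=1e-3):
    """theta_tt - a^2 theta_xx, approximated by central second differences."""
    th_tt = (theta(x, t + h) - 2.0 * theta(x, t) + theta(x, t - h)) / (h * h)
    th_xx = (theta(x + h, t) - 2.0 * theta(x, t) + theta(x - h, t)) / (h * h)
    return th_tt - a * a * th_xx

period = 2.0 * math.pi / lam
for theta in (theta_periodic, theta_exponential):
    print(theta.__name__,
          wave_residual(theta, 0.3, 0.2),
          theta(0.3, 0.2 + period) - theta(0.3, 0.2))
```

Both forms leave only a discretization-sized residual in the wave equation, but only the trigonometric form returns to its earlier value after the interval $2\pi/\lambda$.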

It remains now to find the value or values of $\lambda$ and the constants
$A$, $B$, $C$, and $D$. Since the admissible values of $\lambda$ are determined
by the boundary conditions of the problem, the continuation
now varies in some respects, depending upon how the shaft
is constrained at its ends. We shall discuss in turn the following
simple cases (Fig. 8.11):

FIGURE 8.11
End conditions for a shaft vibrating torsionally: (a) fixed-fixed; (b) free-free; (c) fixed-free.


a Both ends of the shaft are built-in, i.e., are constrained so
that no twisting can take place.
b Both ends of the shaft are free to twist.
c One end of the shaft is built-in; the other is free to twist.

If both ends of the shaft are held fixed, we have the following
conditions to impose upon the general expression for $\theta(x,t)$, assuming
the $x$-axis chosen along the shaft so that the left end of the
shaft is at $x = 0$ and the right end is at $x = l$:

$$\theta(0,t) = \theta(l,t) = 0 \qquad \text{identically in } t$$

Substituting $x = 0$ into the expression (2), we find

$$\theta(0,t) \equiv 0 = C(A\cos\lambda t + B\sin\lambda t)$$




(3) 


This condition will obviously be fulfilled for all values of $t$ if both
$A$ and $B$ are zero. In this case, however, $\theta(x,t)$ is zero at all times
and the shaft remains motionless, a possible but trivial solution
in which we have no interest. Hence we are driven to the other
alternative, $C = 0$, which reduces (2) to the form

$$\theta(x,t) = D\sin\frac{\lambda}{a}x\,(A\cos\lambda t + B\sin\lambda t)$$

The second boundary condition, namely, that the right end
of the shaft remains motionless at all times, requires that

$$\theta(l,t) \equiv 0 = D\sin\frac{\lambda l}{a}\,(A\cos\lambda t + B\sin\lambda t)$$

As before, we reject the possibility that $A = B = 0$, since it leads
only to a trivial solution. Moreover, we cannot permit $D = 0$,
since that, too, with $C$ already zero, leads to the trivial case. The
only possibility which remains is that

$$\sin\frac{\lambda l}{a} = 0 \qquad \text{or} \qquad \frac{\lambda l}{a} = n\pi$$

From the continuous infinity of values of the parameter $\lambda$
for which periodic product solutions of the wave equation exist,
we have thus been forced to reject all but the values

$$\lambda_n = \frac{n\pi a}{l} \qquad n = 1, 2, 3, \ldots$$

These and only these values of $\lambda$ (still infinite in number, however)
yield solutions which, in addition to being periodic, also
satisfy the end, or boundary, conditions of the problem at hand.
With these solutions, one for each admissible value of $\lambda$, we must
now attempt to construct a solution which will satisfy the remaining
conditions of the problem, namely, that the shaft starts its
motion at $t = 0$ with a known angle of twist $\theta(x,0) = f(x)$ and a
known angular velocity $\left.\dfrac{\partial\theta}{\partial t}\right|_{x,0} = g(x)$ at every section.

Now the wave equation is linear, and, thus, if we have
several solutions, their sum is also a solution. Hence, writing
the solution associated with the $n$th value of $\lambda$ in the form

$$\theta_n(x,t) = \sin\frac{\lambda_n}{a}x\,(A_n\cos\lambda_n t + B_n\sin\lambda_n t) = \sin\frac{n\pi x}{l}\left(A_n\cos\frac{n\pi at}{l} + B_n\sin\frac{n\pi at}{l}\right)\,†$$

it is natural enough (though perhaps optimistic, in view of the
questions of convergence that are raised) to ask if an infinite

† The constants $A$ and $B$ now bear subscripts to indicate that they are not
necessarily the same in the solutions associated with the different values of $\lambda$.
The constant $D$ can, of course, be absorbed into the constants $A$ and $B$
and need not be explicitly included.





series of all the $\theta_n$’s, say

$$\theta(x,t) = \sum_{n=1}^{\infty} \theta_n(x,t) = \sum_{n=1}^{\infty} \sin\frac{n\pi x}{l}\left(A_n\cos\frac{n\pi at}{l} + B_n\sin\frac{n\pi at}{l}\right) \tag{4}$$

can be made to yield a solution fitting the initial conditions of
angular displacement and velocity.

This can be done, and in fact in this case the determination
of the coefficients $A_n$ and $B_n$ requires nothing more than a simple
application of Fourier series, as developed in Chap. 6. For, if
we set $t = 0$ in $\theta(x,t)$, we obtain from Eq. (4), and the given initial
displacement condition,

$$\theta(x,0) \equiv f(x) = \sum_{n=1}^{\infty} A_n\sin\frac{n\pi x}{l}$$

The problem of determining the $A_n$’s so that this will be true is
nothing but the problem of expanding a given function $f(x)$ in a
half-range sine series over the interval $(0,l)$. Using Theorem 2,
Sec. 6.4, we have explicitly

$$A_n = \frac{2}{l}\int_0^l f(x)\sin\frac{n\pi x}{l}\,dx$$

Also,

$$\frac{\partial\theta}{\partial t} = \sum_{n=1}^{\infty} \sin\frac{n\pi x}{l}\left(-A_n\sin\frac{n\pi at}{l} + B_n\cos\frac{n\pi at}{l}\right)\frac{n\pi a}{l}$$

Hence, putting $t = 0$, we have from the initial velocity condition,

$$\left.\frac{\partial\theta}{\partial t}\right|_{x,0} \equiv g(x) = \sum_{n=1}^{\infty}\left(\frac{n\pi a}{l}B_n\right)\sin\frac{n\pi x}{l}$$

This, again, merely requires that the $B_n$’s be determined so that
the quantities

$$\frac{n\pi a}{l}B_n$$

will be the coefficients in the half-range sine expansion of the known
function $g(x)$. Thus

$$\frac{n\pi a}{l}B_n = \frac{2}{l}\int_0^l g(x)\sin\frac{n\pi x}{l}\,dx \qquad \text{or} \qquad B_n = \frac{2}{n\pi a}\int_0^l g(x)\sin\frac{n\pi x}{l}\,dx$$
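
The coefficient formulas lend themselves to direct computation. The sketch below is an illustration only: the shaft length `l`, the velocity `a`, the initial twist $f(x) = x(l - x)$, the zero initial velocity, the use of Simpson’s rule, and the truncation of the series at 30 terms are all assumptions made here, and no claim about convergence is intended beyond this one example.

```python
import math

l, a = 1.0, 2.0     # hypothetical shaft length and wave velocity

def f(x):
    """Hypothetical initial angle of twist, zero at both built-in ends."""
    return x * (l - x)

def g(x):
    """Hypothetical initial angular velocity (shaft released from rest)."""
    return 0.0

def sine_coefficient(func, n, m=1000):
    """(2/l) * integral over (0, l) of func(x) sin(n pi x / l), by Simpson's rule."""
    h = l / m
    total = 0.0
    for k in range(m + 1):
        w = 1 if k in (0, m) else (4 if k % 2 else 2)
        x = k * h
        total += w * func(x) * math.sin(n * math.pi * x / l)
    return (2.0 / l) * total * h / 3.0

def coefficients(terms):
    """List of (lambda_n, A_n, B_n) for n = 1, ..., terms."""
    out = []
    for n in range(1, terms + 1):
        lam = n * math.pi * a / l
        out.append((lam, sine_coefficient(f, n), sine_coefficient(g, n) / lam))
    return out

def theta(x, t, coeffs):
    """Partial sum of the series (4)."""
    total = 0.0
    for n, (lam, A_n, B_n) in enumerate(coeffs, start=1):
        total += math.sin(n * math.pi * x / l) * (
            A_n * math.cos(lam * t) + B_n * math.sin(lam * t))
    return total

coeffs = coefficients(30)
print(theta(0.5, 0.0, coeffs), f(0.5))   # the partial sum reproduces f at t = 0
```

Dividing the sine coefficient of $g$ by $\lambda_n = n\pi a/l$ is the same as applying the factor $2/n\pi a$ in the formula for $B_n$.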

Aside from convergence questions, our problem is now completely
solved. We know that a uniform shaft with both ends
restrained against twisting can vibrate torsionally at any of an
infinite number of natural frequencies,

$$f_n = \frac{\lambda_n}{2\pi} = \frac{na}{2l} \quad \text{cycles/unit time} \qquad n = 1, 2, 3, \ldots$$

If and when the shaft vibrates at a single one of these frequencies,
we know that the angular displacements along the shaft vary
periodically between extreme values proportional to

$$\sin\frac{n\pi x}{l}$$

Finally, assuming any initial conditions of velocity and displacement
which satisfy the Dirichlet conditions, we know how to
construct, at least formally,* the instantaneous deflection curve
as an infinite series of the deflection curves associated with the
respective natural frequencies $\lambda_n$.

The treatment of the shaft with both ends free follows closely
the preceding analysis, once we obtain the proper analytic formulation
of the end conditions. To obtain this formulation, we
observe that at a free end, although we do not know the amount of
twist, we do know that there is no torque acting through the end
section. Recalling from the discussion of Sec. 8.2 the expression
for the torque transmitted through a general cross section of a
twisted shaft, we thus find the free ends characterized by the
requirement that

$$E_s J\left.\frac{\partial\theta}{\partial x}\right|_{\text{end}} = 0$$

Since $E_s$ is a nonzero constant of the material of the shaft and
since $J$ cannot vanish for a shaft of uniform section such as we
are considering, it follows that at a free end $\partial\theta/\partial x = 0$.

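Returning to the original product solution (2), we find that

$$\frac{\partial\theta}{\partial x} = \left(-C\frac{\lambda}{a}\sin\frac{\lambda}{a}x + D\frac{\lambda}{a}\cos\frac{\lambda}{a}x\right)(A\cos\lambda t + B\sin\lambda t)$$

Substituting $x = 0$ and equating the result to zero, we obtain the
condition

$$\frac{\lambda}{a}D(A\cos\lambda t + B\sin\lambda t) = 0 \qquad \text{for all } t$$

and from this we conclude that $D = 0$. Substituting $x = l$ and
again equating to zero, we find

$$-C\frac{\lambda}{a}\sin\frac{\lambda l}{a}\,(A\cos\lambda t + B\sin\lambda t) = 0$$

Since we cannot permit $C = 0$, we must have

$$\sin\frac{\lambda l}{a} = 0 \qquad \text{or} \qquad \frac{\lambda l}{a} = n\pi$$

Thus, as in the last example, to have the end conditions of the
problem fulfilled, $\lambda$ must be restricted to one of the discrete set of
values

$$\lambda_n = \frac{n\pi a}{l} \qquad n = 1, 2, 3, \ldots$$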

* See the footnote to Example 2, Sec. 8.3.




(5) 


(6) 


Again, we construct the product solution for each admissible
value of $\lambda$:

$$\theta_n(x,t) = \cos\frac{\lambda_n}{a}x\,(A_n\cos\lambda_n t + B_n\sin\lambda_n t) = \cos\frac{n\pi x}{l}\left(A_n\cos\frac{n\pi at}{l} + B_n\sin\frac{n\pi at}{l}\right)$$

and attempt to form an infinite series of these solutions,

$$\theta(x,t) = \sum_{n=1}^{\infty}\theta_n(x,t) = \sum_{n=1}^{\infty}\cos\frac{n\pi x}{l}\left(A_n\cos\frac{n\pi at}{l} + B_n\sin\frac{n\pi at}{l}\right)$$

which will satisfy the initial displacement condition $\theta(x,0) = f(x)$
and the initial velocity condition $\left.\dfrac{\partial\theta}{\partial t}\right|_{x,0} = g(x)$.

To satisfy the initial displacement condition, we must have

$$\theta(x,0) \equiv f(x) = \sum_{n=1}^{\infty} A_n\cos\frac{n\pi x}{l}$$

which requires that the $A_n$’s be the coefficients in the half-range
cosine expansion* of the known function $f(x)$, that is, that

$$A_n = \frac{2}{l}\int_0^l f(x)\cos\frac{n\pi x}{l}\,dx$$

To satisfy the initial velocity condition, we must have

$$\left.\frac{\partial\theta}{\partial t}\right|_{x,0} \equiv g(x) = \sum_{n=1}^{\infty}\left(\frac{n\pi a}{l}B_n\right)\cos\frac{n\pi x}{l}$$

which requires that the quantities

$$\frac{n\pi a}{l}B_n$$

be the coefficients in the half-range cosine series* for $g(x)$ over the
interval $(0,l)$, that is, that

$$\frac{n\pi a}{l}B_n = \frac{2}{l}\int_0^l g(x)\cos\frac{n\pi x}{l}\,dx \qquad \text{or} \qquad B_n = \frac{2}{n\pi a}\int_0^l g(x)\cos\frac{n\pi x}{l}\,dx$$

We note in passing that, since the admissible $\lambda$'s are the same
for the free-free shaft and the fixed-fixed shaft, the natural fre-
quencies of the two systems are the same. The amplitudes through
which they vibrate are not the same, however. In fact, for the
fixed-fixed shaft we found the distribution of amplitudes along the
shaft given by the function $\sin(n\pi x/l)$, whereas for the free-free
shaft the amplitudes are given by $\cos(n\pi x/l)$.

* In general, the half-range cosine expansion of a function begins with a
constant term. This series does not, because we rejected earlier the possibility
$\mu = 0$, which would have led to such a term. Had there been an acceptable
product solution corresponding to $\mu = 0$ we would, of course, have had to add
it to the solutions arising from the assumption $\mu = -\lambda^2$ when we constructed
the infinite series for $\theta(x,t)$. (See Exercise 1.)

The case of the shaft with one end fixed and the other free
can be disposed of quickly. Taking the fixed end at $x = 0$ and the
free end at $x = l$, we have the two conditions

\[ \theta(0,t) = 0 \quad\text{and}\quad \left.\frac{\partial\theta}{\partial x}\right|_{l,t} = 0 \quad\text{for all } t \]

Imposing these upon the general product solution (2) gives

\[ C(A\cos\lambda t + B\sin\lambda t) = 0 \quad\text{or}\quad C = 0 \]

and

\[ \frac{\lambda}{a}\,D\cos\frac{\lambda l}{a}\,(A\cos\lambda t + B\sin\lambda t) = 0 \]

from which we conclude that

\[ \cos\frac{\lambda l}{a} = 0 \qquad \frac{\lambda l}{a} = \frac{(2n-1)\pi}{2} \qquad \lambda_n = \frac{(2n-1)a\pi}{2l} \]

The general solution of the problem, formed by adding together
the product solutions corresponding to each $\lambda_n$, is therefore

\[ \theta(x,t) = \sum_{n=1}^{\infty}\sin\frac{\lambda_n x}{a}\,(A_n\cos\lambda_n t + B_n\sin\lambda_n t)
   = \sum_{n=1}^{\infty}\sin\frac{(2n-1)\pi x}{2l}\left[A_n\cos\frac{(2n-1)\pi a t}{2l} + B_n\sin\frac{(2n-1)\pi a t}{2l}\right] \]

To fit the initial displacement condition $\theta(x,0) = f(x)$, we
must have

\[ f(x) = \sum_{n=1}^{\infty} A_n\sin\frac{(2n-1)\pi x}{2l} \]

This is not quite the usual half-range sine expansion problem,
since the arguments of the various terms are not integral multiples
of the fundamental argument $\pi x/l$. It is, however, the special
half-range sine expansion over $(0,l)$ discussed in Exercise 13,
Sec. 6.3, where the formula for the coefficients was shown to be

\[ A_n = \frac{2}{l}\int_0^l f(x)\sin\frac{(2n-1)\pi x}{2l}\,dx \]

Similarly, to fit the initial velocity condition $\left.\dfrac{\partial\theta}{\partial t}\right|_{x,0} = g(x)$,
we must have

\[ g(x) = \sum_{n=1}^{\infty}\left[\frac{(2n-1)\pi a}{2l}B_n\right]\sin\frac{(2n-1)\pi x}{2l} \]

which requires that

\[ B_n = \frac{4}{(2n-1)a\pi}\int_0^l g(x)\sin\frac{(2n-1)\pi x}{2l}\,dx \]
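The coefficient formula for the fixed-free shaft can be checked numerically. The following is a minimal sketch, not part of the original text: the shaft length `l = 2.0`, the initial twist `f(x) = x`, and the helper names `simpson`, `quarter_wave_coefficient` and `partial_sum` are all illustrative assumptions, and the integral is approximated by a composite Simpson rule.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def quarter_wave_coefficient(f, n, l):
    """A_n = (2/l) * integral_0^l f(x) sin((2n-1) pi x / (2l)) dx."""
    k = (2 * n - 1) * math.pi / (2 * l)
    return (2 / l) * simpson(lambda x: f(x) * math.sin(k * x), 0.0, l)

def partial_sum(coeffs, x, l):
    """Sum of A_n sin((2n-1) pi x / (2l)) over the supplied coefficients."""
    return sum(
        a_n * math.sin((2 * n - 1) * math.pi * x / (2 * l))
        for n, a_n in enumerate(coeffs, start=1)
    )

l = 2.0                      # illustrative shaft length (assumption)
f = lambda x: x              # illustrative initial twist, proportional to x
coeffs = [quarter_wave_coefficient(f, n, l) for n in range(1, 41)]
```

For this particular choice of `f`, integrating by parts gives $A_n = 8l(-1)^{n+1}/[(2n-1)^2\pi^2]$, so the computed values can be compared with a closed form, and `partial_sum(coeffs, x, l)` should approach `f(x)` on $(0,l)$ as more terms are kept.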


EXERCISES 

1 Discuss the restrictions implicitly imposed on f(x) and g(x) by the absence of constant terms 
in the series in Eqs. (5) and (6). What is the physical significance of these restrictions? 

2 Verify that the solutions of the wave equation obtained in this section can all be written in
the form $\theta(x,t) = F(x - at) + G(x + at)$, as required by the D'Alembert theory.



3 Which of the following equations can be solved by the method of separation of variables?
Where possible, determine the product solutions.


a (- i, [- c — = 0 

dx- dx dy dy 

+ b — + c — — = 0 

dx dy dy 2 


6a: 2 


f 


+ b 


dz 2 

. 9u 
+ c-~ + d 
dx 


dy 


* 0 


4 A uniform shaft, fixed at one end and free at the other, is twisted so that each cross section
rotates through an angle proportional to the distance from the fixed end. If the shaft is 
released from rest in this position, find its subsequent angular displacement as a function 
of x and t. 

5 A uniform shaft, fixed at each end, is twisted so that each cross section rotates through an
angle proportional to x(l — x) where l is the length of the shaft and x is the distance from 
the left end. If the shaft is released from rest in this position, find its subsequent angular 
displacement as a function of x and t. 

6 A uniform shaft, free at each end, is twisted so that each cross section rotates through an
angle proportional to $(2x - l)/2$, where $l$ is the length of the shaft and $x$ is the distance
from the left end. If the shaft is released from rest in this position, find its subsequent 
angular displacement as a function of x and t. 

7 Show that the natural frequencies of a uniform string are given by the formula

\[ f_n = \frac{n}{2l}\sqrt{\frac{Tg}{w}} \qquad n = 1, 2, 3, \ldots \]

where $l$ is the length of the string, $T$ is the tension under which it is stretched, and $w$ is its
weight per unit length. How does doubling the tension affect the pitch of the fundamental
tone of the string? Why is it that most string instruments either have strings of different
lengths or have the lengths of their strings changed by the performer as he plays?

8 A uniform string, stretched between the points $(0,0)$ and $(l,0)$, is given the initial
displacement


2/M) - /(*) = 


l- 


0 < X < - 


- < x < l 


and released from rest. Find its subsequent displacement as a function of x and t. 

9 While in its equilibrium position, a uniform string, stretched between the points $(0,0)$ and
$(l,0)$, is given the initial velocity


2/M) = g(x) -■ 


l - 


0 < x < 

1 


l 


< l 


Find its subsequent displacement as a function of x and t. 

10 While in its equilibrium position, a uniform string, stretched between the points $(0,0)$ and
$(l,0)$, is given the initial velocity

' 1 ~ k 
2 

l - k t + k 

— — < x < ~~ 

2 2 

l + k 


0 < x < - 


y(x,0) = y(x) -- 


- < x < l 



Find its subsequent displacement as a function of x and t. Does your answer appear to have
a meaningful limit as $k \to 0$? If so, to what problem do you think it is the answer?

11 A uniform string, stretched between the points $(0,0)$ and $(l,0)$, is given the following initial
displacement and initial velocity:


y(x, 0) = f(x) 


y(x, 0) = g(x) 


. trX 



0 < 
l 


l 

:< 4 


Find its subsequent displacement as a function of x and t. 

12 The curved surface of a rod of length $l$ is perfectly insulated against the flow of heat. The
rod, which is so thin that heat flow in it can be assumed to be one-dimensional, is initially
at the uniform temperature 100°. Find the temperature at any point in the rod at any
subsequent time if both ends of the rod are kept at the temperature 0°. (Hint: For heat
flow in one dimension, the heat equation reduces to $\dfrac{\partial^2 u}{\partial x^2} = a^2\dfrac{\partial u}{\partial t}$.)

13 Work Exercise 12, with both ends of the rod insulated and the initial temperature
distribution in the rod given by


u(*,0) = f(x) — Mo * 0 < x < l 


where x is the distance from the left end of the rod. (Hint: The temperature gradient 
through an insulated surface must be 0.) 

14 Work Exercise 12 with the left end of the rod maintained at the constant temperature 0° 
and the right end perfectly insulated. 

15 Show that the torsional vibrations of any uniform fixed-free shaft of length $l$ are always the
same as those of the left half of a suitably chosen fixed-fixed shaft of length $2l$. Is the
converse true? That is, does the motion of the left half of a fixed-fixed shaft of length $2l$ always
represent a possible motion of a fixed-free shaft of length $l$?


8.5 

Orthogonal functions and the general expansion problem

The three examples we considered in the last section embody all 
the significant features of the general boundary value problem. 
However, they give an exaggerated picture of the role of Fourier 
series in the final expansion process that is required in order to fit 
the initial conditions. In general, a knowledge of Fourier series, as 
such, will not suffice to obtain the necessary expansion. Hence 
before we attempt to summarize the major characteristics of 
boundary value problems, as illustrated in our examples, we shall 
consider an additional example or two in which Fourier series play 
no part. 

EXAMPLE 1 

A slender rod of length $l$ has its curved surface perfectly insulated against the flow of heat. Its
left end is maintained at the constant temperature $u = 0$, and its right end radiates freely into
air of constant temperature $u = 0$. If the initial temperature distribution in the rod is given by

\[ u(x,0) = f(x) \]

find the temperature at any point of the rod at any subsequent time.

Since the rod is very thin and since its lateral surface is perfectly insulated, we shall assume 
that all points of any given cross section are at the same temperature and that the flow of heat 
in the rod is, therefore, entirely in the $x$-direction. Thus we have to solve the heat equation
[Eq. (14), Sec. 8.2] specialized to one-dimensional flow without heat sources:

\[ \frac{\partial^2 u}{\partial x^2} = a^2\frac{\partial u}{\partial t} \tag{1} \]


At the left end of the rod we have the obvious fixed-temperature condition u(0,t) = 0. At the 
right end we have a radiation condition which must be formulated analytically before we can 
proceed with our solution. 

Now, according to Stefan's law, the amount of heat radiated from a given area $dA$ in a
given time interval $dt$ is

\[ dQ = \sigma(U^4 - U_0^4)\,dA\,dt \]

where $U$ and $U_0$ are, respectively, the absolute temperatures of the radiating surface and of the
surrounding medium and $\sigma$ is a proportionality constant. This quantity of heat must have come
to the surface by conduction from the interior of the body; hence, we have as a second estimate
for $dQ$ the expression

\[ dQ = -k\,\frac{\partial U}{\partial n}\,dA'\,dt \]

where $k$ is the thermal conductivity, $\partial U/\partial n$ is the temperature gradient in the direction
perpendicular to $dA$, and $dA'$ is an element of area, congruent to $dA$, situated in the body an infinitesimal
distance from $dA$ in the normal direction. Therefore, equating the two expressions for $dQ$, we
have

\[ -k\,\frac{\partial U}{\partial n}\,dA'\,dt = \sigma(U^4 - U_0^4)\,dA\,dt \]

or, canceling the common factors and expanding $U^4 - U_0^4$ in powers of $U - U_0$,

\[ -k\,\frac{\partial U}{\partial n} = \sigma\{[U_0 + (U - U_0)]^4 - U_0^4\}
   = \sigma[4U_0^3(U - U_0) + 6U_0^2(U - U_0)^2 + \cdots] \]

Finally, if $U - U_0$ is small in comparison with $U_0$, as we shall suppose, we can neglect
everything on the right except the first term, getting

\[ \frac{\partial U}{\partial n} = -\frac{4\sigma U_0^3}{k}\,(U - U_0) \]

In our problem, the normal to the surface from which radiation takes place, i.e., the right end
of the rod, is the $x$-axis. Hence if we measure temperatures from $U_0$ as a reference value, so that
$u = U - U_0$, our second boundary condition becomes simply

\[ -\left.\frac{\partial u}{\partial x}\right|_{l,t} = h\,u(l,t) \qquad\text{where}\quad h = \frac{4\sigma U_0^3}{k} \tag{2} \]


As before, we begin by assuming a product solution $u = XT$ and substituting it into the
heat equation (1):

\[ X''T = a^2XT' \]



Dividing by $XT$, we have

\[ \frac{X''}{X} = a^2\frac{T'}{T} \]

from which, since $x$ and $t$ are independent variables, we conclude that

\[ \frac{X''}{X} \quad\text{and}\quad a^2\frac{T'}{T} \]

must equal the same constant, say $\mu$.

If $\mu > 0$, say $\mu = \lambda^2$, we have, from the fraction involving $T$,

\[ T' = \frac{\lambda^2}{a^2}\,T \quad\text{and}\quad T = Ce^{\lambda^2 t/a^2} \]

But this is absurd, since it indicates that the temperature $u = XT$ increases beyond all bounds
as $t$ increases. Hence we reject the possibility that $\mu > 0$.

If $\mu = 0$, we have simply

\[ X'' = 0 \qquad T' = 0 \]
\[ X = Ax + B \qquad T = C \]

and, letting $C = 1$, as we can without loss of generality,

\[ u = XT = Ax + B \]

For this to be relevant to our problem it must reduce to 0 when $x = 0$; hence $B = 0$. Moreover
it must satisfy Eq. (2) when $x = l$; hence $A = 0$. Thus $\mu = 0$ leads only to a trivial solution
and must also be rejected.

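Finally, if $\mu < 0$, say $\mu = -\lambda^2$, the component differential equations and their solutions are

\[ X'' = -\lambda^2X \qquad T' = -\frac{\lambda^2}{a^2}\,T \]
\[ X = A\cos\lambda x + B\sin\lambda x \qquad T = Ce^{-\lambda^2 t/a^2} \]

and, again letting $C = 1$,

\[ u = XT = (A\cos\lambda x + B\sin\lambda x)\,e^{-\lambda^2 t/a^2} \]

To fit the left end condition we must have $u(0,t) \equiv 0 = Ae^{-\lambda^2 t/a^2}$. Hence $A = 0$, and $u$ reduces
to $u = Be^{-\lambda^2 t/a^2}\sin\lambda x$. To fit the right end condition (2), we must have

\[ -\lambda Be^{-\lambda^2 t/a^2}\cos\lambda l = hBe^{-\lambda^2 t/a^2}\sin\lambda l \]

or, dividing out the exponential and collecting terms,

\[ B(h\sin\lambda l + \lambda\cos\lambda l) = 0 \]

If $B = 0$, the solution is trivial. Hence we must have

\[ h\sin\lambda l + \lambda\cos\lambda l = 0 \]

or

\[ \tan\lambda l = -\frac{\lambda}{h} = -\frac{\lambda l}{hl} \]

or finally

\[ \tan z = -\alpha z \qquad\text{where}\quad z = \lambda l \quad\text{and}\quad \alpha = \frac{1}{hl} \]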

This equation is not like the simple equations

\[ \sin\lambda l = 0 \quad\text{and}\quad \cos\lambda l = 0 \]

which determined the admissible values of $\lambda$ in the examples of the last section, and its roots
cannot be found by inspection. To determine them it is convenient to consider the graphs of the
two functions

\[ y_1 = \tan z \quad\text{and}\quad y_2 = -\alpha z \]

The abscissas of the points of intersection of these curves (Fig. 8.12), being values of $z$ for which
$y_1 = y_2$, are then the solutions of the equation

\[ \tan z = -\alpha z \]

Obviously, there are an infinite number of roots $z_n$. However, unlike the roots of $\sin\lambda l = 0$ and
$\cos\lambda l = 0$, they are not evenly spaced, although, as the graph in Fig. 8.12 indicates, the interval
between successive values of $z_n$ approaches $\pi$ as $n$ becomes infinite.
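Since the roots of $\tan z = -\alpha z$ cannot be written down by inspection, they are usually found numerically. The sketch below is one possible way to do this and is not taken from the text: it assumes $\alpha > 0$, uses the fact that the graphs intersect exactly once in each interval $((n - \tfrac12)\pi,\ n\pi)$, and applies bisection to $\sin z + \alpha z\cos z$, which has the same roots but no poles. The function name `radiation_roots` and the value `alpha = 0.5` are illustrative choices.

```python
import math

def radiation_roots(alpha, count, tol=1e-12):
    """First `count` positive roots of tan z = -alpha*z, assuming alpha > 0."""
    def g(z):
        # Same roots as tan z + alpha*z, but without the poles of tan z.
        return math.sin(z) + alpha * z * math.cos(z)

    roots = []
    for n in range(1, count + 1):
        # g changes sign exactly once on ((n - 1/2) pi, n pi).
        lo, hi = (n - 0.5) * math.pi, n * math.pi
        g_lo_positive = g(lo) > 0
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if (g(mid) > 0) == g_lo_positive:
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots

alpha = 0.5                       # illustrative value of 1/(h*l)
z = radiation_roots(alpha, 8)
gaps = [z[i + 1] - z[i] for i in range(len(z) - 1)]
```

The list `gaps` holds the spacings between successive roots; they are unequal but move toward $\pi$ as $n$ grows, in agreement with the remark about Fig. 8.12.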

From each root $z_n$, we obtain at once the corresponding value of $\lambda$,

\[ \lambda_n = \frac{z_n}{l} \]

and the associated product solution

\[ u_n(x,t) = T_n(t)X_n(x) = B_ne^{-\lambda_n^2 t/a^2}\sin\lambda_n x \]

Then we form a series of these particular solutions

\[ u(x,t) = \sum_{n=1}^{\infty}u_n(x,t) = \sum_{n=1}^{\infty}B_ne^{-\lambda_n^2 t/a^2}\sin\lambda_n x \tag{3} \]

and attempt to determine the constants $B_n$ so that the function defined by the series will satisfy
the initial condition

\[ u(x,0) = f(x) \]

Finally, putting $t = 0$ in (3), we find that this requires

\[ u(x,0) \equiv f(x) = \sum_{n=1}^{\infty}B_n\sin\lambda_n x \tag{4} \]

Thus, as in the examples in the last section, to satisfy the initial condition we must be able 
to expand an arbitrary function in an infinite series of known functions, determined by a differ- 
ential equation and a set of boundary conditions. However, although the functions in terms of 
which the expansion is to be carried out are sines, the values of X appearing in their arguments 
are spaced at incommensurable intervals, and so the required series is not a Fourier series. 
Clearly, something is involved which includes Fourier series as a special case but is itself more 
general and more fundamental. 



If we review thoughtfully our earlier discussion of Fourier
series (Sec. 6.2), it should be apparent that the decisive property
of the set of functions $\{\cos(n\pi x/l),\ \sin(n\pi x/l)\}$ which made it
possible to determine one by one the coefficients in the assumed
expansion

\[ f(x) = \tfrac12 a_0 + a_1\cos\frac{\pi x}{l} + a_2\cos\frac{2\pi x}{l} + \cdots
        + b_1\sin\frac{\pi x}{l} + b_2\sin\frac{2\pi x}{l} + \cdots \]

was that the integral of the product of any two distinct members
of the set taken over the appropriate interval is zero. For it was
this that enabled us to multiply the series for $f(x)$ by $\cos(n\pi x/l)$
or $\sin(n\pi x/l)$ and eliminate all but one of the unknown coefficients
simply by integrating from $d$ to $d + 2l$.

Now, sines and cosines are by no means the only functions 
from which sets can be constructed having the property that the 
integral between suitable limits of the product of two distinct 
members of the set is zero. In fact, the trigonometric functions 
which appear in Fourier expansions are merely one of the simplest 
examples of infinitely many such systems of functions, whose 
existence we shall soon establish. 


DEFINITION 1 

If a sequence of real functions

\[ \{\phi_n(x)\} \qquad n = 1, 2, 3, \ldots \]

which are defined over some interval $(a,b)$, finite or infinite, has the property that

\[ \int_a^b \phi_m(x)\phi_n(x)\,dx \;\begin{cases} = 0 & m \ne n \\ \ne 0 & m = n \end{cases} \]

then the functions are said to form an orthogonal set on that interval.

DEFINITION 2

If the functions of an orthogonal set $\{\phi_n(x)\}$ have the property that

\[ \int_a^b \phi_n^2(x)\,dx = 1 \quad\text{for all values of } n \]

then the functions are said to be orthonormal on the interval $(a,b)$.

Any set of orthogonal functions can easily be converted into an
orthonormal set. In fact, if the functions of the set $\{\phi_n(x)\}$ are
orthogonal and if $k_n$ is the (necessarily positive) value of
$\int_a^b \phi_n^2(x)\,dx$, then the functions

\[ \frac{\phi_1(x)}{\sqrt{k_1}},\quad \frac{\phi_2(x)}{\sqrt{k_2}},\quad \frac{\phi_3(x)}{\sqrt{k_3}},\quad \ldots \]

are clearly orthonormal. It is, therefore, no specialization to
assume that an orthogonal set of functions is also orthonormal.



DEFINITION 3 

If a sequence of real functions $\{\phi_n(x)\}$ has the property that, over some interval
$(a,b)$, finite or infinite,

\[ \int_a^b p(x)\phi_m(x)\phi_n(x)\,dx \;\begin{cases} = 0 & m \ne n \\ \ne 0 & m = n \end{cases} \]

then the functions are said to be orthogonal with respect to the weight function
$p(x)$ on that interval.

Any set of functions orthogonal with respect to a weight function
$p(x)$ can be reduced to a system orthogonal in the first sense
simply by multiplying each member of the set by $\sqrt{p(x)}$ if, as we
shall suppose, $p(x) \ge 0$ on the interval of orthogonality.

With respect to any set of functions $\{\phi_n(x)\}$ orthogonal over
an interval $(a,b)$, an arbitrary function $f(x)$ has a formal expansion
analogous to a Fourier expansion, for we can write

\[ f(x) = a_1\phi_1(x) + a_2\phi_2(x) + \cdots + a_n\phi_n(x) + \cdots \tag{5} \]

Then, multiplying by $\phi_n(x)$ and formally integrating between the
appropriate limits, $a$ and $b$, we have

\[ \int_a^b f(x)\phi_n(x)\,dx = a_1\int_a^b \phi_1(x)\phi_n(x)\,dx + a_2\int_a^b \phi_2(x)\phi_n(x)\,dx + \cdots + a_n\int_a^b \phi_n^2(x)\,dx + \cdots \]

From the property of orthogonality, all integrals on the right are
zero except the one which contains a square in its integrand.
Hence, we can solve at once for $a_n$ as the quotient of two known
integrals:

\[ a_n = \frac{\displaystyle\int_a^b f(x)\phi_n(x)\,dx}{\displaystyle\int_a^b \phi_n^2(x)\,dx} \]

However, although the orthogonality of the $\phi$'s makes it possible
to determine the coefficients in the expansion (5), this property
is not sufficient to guarantee that this series converges to $f(x)$ or
even converges at all.
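The quotient formula for $a_n$ applies to any orthogonal set, not only to sines and cosines. As a hedged illustration that is not part of the original text, the sketch below applies it to the first four Legendre polynomials, which are known to be orthogonal on $(-1,1)$ with weight 1, and expands $f(x) = x^3$. The helper names `simpson` and `expansion_coefficients` are illustrative, and the integrals are approximated by a composite Simpson rule.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def expansion_coefficients(f, basis, a, b):
    """a_n = (integral of f*phi_n) / (integral of phi_n**2) for each phi_n."""
    coeffs = []
    for phi in basis:
        numerator = simpson(lambda x: f(x) * phi(x), a, b)
        denominator = simpson(lambda x: phi(x) ** 2, a, b)
        coeffs.append(numerator / denominator)
    return coeffs

# First four Legendre polynomials P_0 .. P_3, orthogonal on (-1, 1).
legendre = [
    lambda x: 1.0,
    lambda x: x,
    lambda x: 0.5 * (3 * x**2 - 1),
    lambda x: 0.5 * (5 * x**3 - 3 * x),
]

coeffs = expansion_coefficients(lambda x: x**3, legendre, -1.0, 1.0)
```

Here the expansion terminates, since $x^3 = \tfrac35 P_1(x) + \tfrac25 P_3(x)$ exactly, so the computed coefficients of $P_0$ and $P_2$ should vanish to within rounding.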

To pursue this matter a little further, it is convenient to
introduce the idea of a null function:

DEFINITION 4

A real function $f(x)$ is said to be a null function on the interval $(a,b)$ if

\[ \int_a^b f^2(x)\,dx = 0 \]

If $f(x)$ is identically zero, it is obviously a null function. However,
a null function need not be identically zero. In fact, since the
area under a curve is not altered by changing the ordinate of the
curve at one or more isolated points, it is clear that we can have
$\int_a^b f^2(x)\,dx = 0$ even though $f(x)$ has nonzero values at a finite or
countably infinite number of points between $a$ and $b$. On the other
hand, if there is any subinterval of $(a,b)$, no matter how short, at
all points of which $f(x)$ is different from zero, then $\int_a^b f^2(x)\,dx \ne 0$
and $f(x)$ is not a null function. From this it is not difficult to show
that a null function is zero at every point where it is continuous.

Clearly, any null function is orthogonal to every member of
an orthogonal set $\{\phi_n(x)\}$. It is conceivable, also, that a nonnull
function $f(x)$ might be orthogonal to every $\phi$, that is, that we might
have

\[ \int_a^b f(x)\phi_n(x)\,dx = 0 \quad\text{for all values of } n \]

In such a case, every coefficient in the expansion of $f(x)$ in terms of
the $\phi$'s would be zero, and the series (5) would converge to zero at
all points of $(a,b)$ even though $f(x)$ was not a null function. That
this is actually possible is easily shown by example. For instance,
although the functions $\{\sin nx\}$ are readily shown to be orthogonal
over the interval $(-\pi,\pi)$, not every function can be represented on
this interval by a series of the form

\[ a_1\sin x + a_2\sin 2x + \cdots + a_n\sin nx + \cdots \]

In particular, if $f(x) = x^2$, we have, for the coefficients in its
formal expansion,

\[ a_n = \frac{\displaystyle\int_{-\pi}^{\pi}x^2\sin nx\,dx}{\displaystyle\int_{-\pi}^{\pi}\sin^2 nx\,dx}
       = \frac{1}{\pi}\left[\frac{2x}{n^2}\sin nx - \left(\frac{x^2}{n} - \frac{2}{n^3}\right)\cos nx\right]_{-\pi}^{\pi} = 0 \]

for all values of $n$. More generally, since every member of the set
$\{\sin nx\}$ is odd, it is clear that no series of these functions can
represent any even function on the interval $(-\pi,\pi)$.
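The vanishing of these coefficients is easy to confirm numerically. The sketch below is an illustration added here, not part of the original text: it evaluates the formal coefficient $\frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx$ with a composite Simpson rule for the even function $x^2$ and, for contrast, for the odd function $x$. The helper names are illustrative.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def sine_coefficient(f, n):
    """Formal coefficient of sin(nx) on (-pi, pi); the norm integral equals pi."""
    integral = simpson(lambda x: f(x) * math.sin(n * x), -math.pi, math.pi)
    return integral / math.pi

even_coeffs = [sine_coefficient(lambda x: x * x, n) for n in range(1, 6)]
odd_coeffs = [sine_coefficient(lambda x: x, n) for n in range(1, 6)]
```

Every entry of `even_coeffs` is zero to within rounding, so the formal series for $x^2$ is identically zero, whereas the entries of `odd_coeffs` follow the familiar pattern $2(-1)^{n+1}/n$.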

Evidently, important as it is, orthogonality is not the whole
story, and the functions in our orthogonal systems must possess
some further property before the expansion (5) can be used with
confidence. What is required is that the set of functions $\{\phi_n(x)\}$,
in addition to being orthogonal, should also possess the property
of completeness described in the following definition:

DEFINITION 5 

A set of orthogonal functions $\{\phi_n(x)\}$ is said to be complete if the relation
$\int_a^b f(x)\phi_n(x)\,dx = 0$ can hold for all values of $n$ only if $f(x)$ is a null function.

If $\{\phi_n(x)\}$ is a complete orthogonal set, then clearly not all
coefficients in the expansion of a nonnull function can be zero, and thus
no nontrivial function can have a trivial expansion. In fact, we
have the following theorem:



THEOREM 1 
If the formal expansion

\[ a_1\phi_1(x) + a_2\phi_2(x) + \cdots + a_n\phi_n(x) + \cdots \]

of a function $f(x)$ in terms of the members of a complete orthonormal set $\{\phi_n(x)\}$
converges and can be integrated term by term, then the sum of the series differs
from $f(x)$ by at most a null function; that is, the sum of the series cannot differ
from $f(x)$ over any interval of finite length.

PROOF By hypothesis, the series $\sum_{n=1}^{\infty}a_n\phi_n(x)$ converges to some function;
hence, it is meaningful to consider the difference

\[ g(x) = f(x) - \sum_{n=1}^{\infty}a_n\phi_n(x) \]

If we can prove that $g(x)$ is a null function, the assertion of the theorem will be
established. To do this, consider

\[
\begin{aligned}
\int_a^b \phi_m(x)g(x)\,dx &= \int_a^b \phi_m(x)\Big[f(x) - \sum_{n=1}^{\infty}a_n\phi_n(x)\Big]\,dx \\
 &= \int_a^b \phi_m(x)f(x)\,dx - \int_a^b \phi_m(x)\Big[\sum_{n=1}^{\infty}a_n\phi_n(x)\Big]\,dx \\
 &= \int_a^b \phi_m(x)f(x)\,dx - \sum_{n=1}^{\infty}a_n\int_a^b \phi_m(x)\phi_n(x)\,dx \\
 &= a_m - a_m \\
 &= 0 \qquad m = 1, 2, 3, \ldots
\end{aligned}
\]

Hence, $g(x)$ is orthogonal to every one of the $\phi$'s. Therefore, since the $\phi$'s form a
complete set, $g(x)$ must be a null function, and the theorem is established.

Closely associated with the concept of completeness is the 
concept of closure* described in the following definitions: 

DEFINITION 6

If $\displaystyle\lim_{n\to\infty}\int_a^b [f(x) - S_n(x)]^2\,dx = 0$, the sequence of functions $S_n(x)$ is said to
converge in the mean to $f(x)$.

DEFINITION 7

If $S_n(x) = a_1\phi_1(x) + a_2\phi_2(x) + \cdots + a_n\phi_n(x)$ is the $n$th partial sum of the
expansion of $f(x)$ in terms of the members of an orthonormal set $\{\phi_n(x)\}$ and if
$S_n(x)$ converges in the mean to $f(x)$ for every $f(x)$, then the set $\{\phi_n(x)\}$ is said to
be closed.

One important property of closed orthonormal sets is
contained in the so-called theorem of Parseval:


* What we have called completeness some authors call closure, and vice versa.



THEOREM 2 

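If $a_1\phi_1(x) + a_2\phi_2(x) + \cdots + a_n\phi_n(x) + \cdots$ is the expansion of a function
$f(x)$ in terms of the members of a closed orthonormal set $\{\phi_n(x)\}$, then

\[ \sum_{n=1}^{\infty}a_n^2 = \int_a^b f^2(x)\,dx \]

PROOF From the definition of closure, we have

\[ \lim_{n\to\infty}\int_a^b \Big[f(x) - \sum_{k=1}^{n}a_k\phi_k(x)\Big]^2\,dx = 0 \]

or

\[ \lim_{n\to\infty}\int_a^b \Big[(f(x))^2 - 2f(x)\sum_{k=1}^{n}a_k\phi_k(x) + \Big\{\sum_{k=1}^{n}a_k\phi_k(x)\Big\}^2\Big]\,dx = 0 \]

If we now perform the indicated integration, remembering that

\[ \int_a^b f(x)\phi_n(x)\,dx = a_n \]

and observing that, in the integral of the last term,

\[ \int_a^b \phi_m(x)\phi_n(x)\,dx = \begin{cases} 0 & m \ne n \\ 1 & m = n \end{cases} \]

we obtain

\[ \lim_{n\to\infty}\Big\{\int_a^b [f(x)]^2\,dx - 2\sum_{k=1}^{n}a_k^2 + \sum_{k=1}^{n}a_k^2\Big\} = 0 \]

or

\[ \sum_{n=1}^{\infty}a_n^2 = \int_a^b f^2(x)\,dx \]

as asserted.

As an immediate consequence of the last theorem, we have
the following important result: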

THEOREM 3 

A closed orthonormal system $\{\phi_n(x)\}$ is also complete.

PROOF To prove this, let us suppose that the closed orthonormal system
$\{\phi_n(x)\}$ is not complete. This implies that there is at least one nonnull function $f(x)$
which is orthogonal to each of the $\phi$'s and which, therefore, has the property that
every coefficient in its expansion in terms of the $\phi$'s is zero. However, since the
set $\{\phi_n(x)\}$ is closed, we have, from Parseval's theorem,

\[ \int_a^b f^2(x)\,dx = \sum_{n=1}^{\infty}a_n^2 \]

Hence, since each $a_n$ is zero, as we have just observed, it follows that $f(x)$ is a null
function, contrary to our assumption. This contradiction forces us to abandon the
supposition that the closed set $\{\phi_n(x)\}$ is incomplete, and the theorem is established.
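Parseval's theorem lends itself to a simple numerical check. The sketch below is an illustration under stated assumptions rather than part of the original text: it takes the orthonormal set $\phi_n(x) = \sqrt{2/\pi}\,\sin nx$ on $(0,\pi)$, which is known to be closed, together with the sample function $f(x) = x$, and compares partial sums of $\sum a_n^2$ with $\int_0^\pi f^2(x)\,dx$. The helper names are illustrative.

```python
import math

def simpson(func, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def phi(n, x):
    """Orthonormal sine functions on (0, pi)."""
    return math.sqrt(2.0 / math.pi) * math.sin(n * x)

def coefficient(f, n):
    """a_n = integral_0^pi f(x) phi_n(x) dx (the set is already normalized)."""
    return simpson(lambda x: f(x) * phi(n, x), 0.0, math.pi)

f = lambda x: x                                   # sample function (assumption)
squares = [coefficient(f, n) ** 2 for n in range(1, 101)]
partial = sum(squares)                            # sum of a_n^2 for n = 1..100
energy = simpson(lambda x: f(x) ** 2, 0.0, math.pi)   # integral of f^2
```

For this choice $a_n^2 = 2\pi/n^2$ and $\int_0^\pi x^2\,dx = \pi^3/3$, so `partial` stays below `energy` and approaches it slowly as more terms are added.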

The converse of Theorem 3 is also true, but the proof of this
fact is difficult, and we shall not attempt it.

A great deal of important advanced mathematics deals with
the properties of special orthogonal systems and with the validity
of the formal expansion we have just created. In the next chapter
we shall examine in some detail two such systems, namely, the
Bessel functions and the Legendre polynomials. Questions
concerning the convergence of the generalized Fourier series (5),
however, we shall not discuss, and in our work we shall assume not
only that all the expansions we obtain converge but also that they
actually represent the functions which generated them.

Orthogonal functions arise naturally and inevitably in many
types of problems in pure and applied mathematics.* Their
existence in problems such as we have been considering is
guaranteed by the following beautiful and important theorem:†


THEOREM 4 

Given the differential equation

\[ \frac{d[r(x)y']}{dx} + [q(x) + \lambda p(x)]\,y = 0 \]

where $r(x)$ and $p(x)$ are continuous on the closed interval $a \le x \le b$ and $q(x)$ is
continuous at least over the open interval $a < x < b$. If $\lambda_1, \lambda_2, \lambda_3, \ldots$ are the
values of the parameter $\lambda$ for which there exist solutions of this equation possessing
continuous first derivatives and satisfying the boundary conditions

\[ a_1y(a) - a_2y'(a) = 0 \]
\[ b_1y(b) - b_2y'(b) = 0 \]

where $a_1, a_2, b_1, b_2$ are any constants such that $a_1$ and $a_2$ are not both zero and $b_1$
and $b_2$ are not both zero, and if $y_1, y_2, y_3, \ldots$ are the solutions corresponding to
these values of $\lambda$, then the functions $\{y_n(x)\}$ form a system orthogonal with respect
to the weight function $p(x)$ over the interval $(a,b)$.

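PROOF To prove this, let $y_m$ and $y_n$ be the solutions associated with two
distinct values of $\lambda$, say $\lambda_m$ and $\lambda_n$. This means that

\[ \frac{d(ry_m')}{dx} + (q + \lambda_mp)\,y_m = 0 \]
\[ \frac{d(ry_n')}{dx} + (q + \lambda_np)\,y_n = 0 \]

Now multiply the first of these equations by $y_n$ and the second by $y_m$ and then
subtract the second equation from the first. The result, after transposing, is

\[ (\lambda_m - \lambda_n)\,p\,y_my_n = y_m\frac{d(ry_n')}{dx} - y_n\frac{d(ry_m')}{dx} \]

or, integrating between $a$ and $b$,

\[ (\lambda_m - \lambda_n)\int_a^b p\,y_my_n\,dx = \int_a^b y_m\frac{d(ry_n')}{dx}\,dx - \int_a^b y_n\frac{d(ry_m')}{dx}\,dx \]
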

* It is interesting and instructive in this connection to reread the discussion 
of orthogonal polynomials in Sec. 4.6 and to refer to the discussion of the 
orthogonality of vectors in Sec. 10.4. 

† This theorem and the boundary value problem with which it deals are
usually associated with the names of the Swiss mathematician J. C. F. Sturm 
(1803-1855) and the French mathematician Joseph Liouville (1809-1882). 



If we can prove that the integral on the left vanishes whenever m and n are differ- 
ent, we shall have established the orthogonality property of the functions of the 
set {y n (x)}. This we shall prove by showing that the right-hand side of the last 
equation is zero. To do this we begin by integrating the terms on the right by parts : 



With $u = y_m$, $dv = d(ry_n')$, so that $du = y_m'\,dx$, $v = ry_n'$, we have

\[ \int_a^b y_m\frac{d(ry_n')}{dx}\,dx = \Big[ry_my_n'\Big]_a^b - \int_a^b ry_m'y_n'\,dx \]

and similarly, with $u = y_n$, $dv = d(ry_m')$, so that $du = y_n'\,dx$, $v = ry_m'$,

\[ \int_a^b y_n\frac{d(ry_m')}{dx}\,dx = \Big[ry_ny_m'\Big]_a^b - \int_a^b ry_n'y_m'\,dx \]

When we subtract these expressions, the integrals which remain on the right
cancel, and we have

\[ \int_a^b y_m\frac{d(ry_n')}{dx}\,dx - \int_a^b y_n\frac{d(ry_m')}{dx}\,dx = \Big[r\,(y_my_n' - y_ny_m')\Big]_a^b \tag{6} \]

Now $y_m$ and $y_n$ are not merely solutions of the given differential equation. For
every $m$ and $n$, they also satisfy the boundary conditions

\[ a_1y(a) = a_2y'(a) \quad\text{and}\quad b_1y(b) = b_2y'(b) \]

Substituting for $y'(a)$ and $y'(b)$ from these expressions into the evaluated
antiderivative in (6), we obtain

\[ \Big[r\,(y_my_n' - y_ny_m')\Big]_a^b
   = r(b)\left[\frac{b_1}{b_2}y_m(b)y_n(b) - \frac{b_1}{b_2}y_m(b)y_n(b)\right]
   - r(a)\left[\frac{a_1}{a_2}y_m(a)y_n(a) - \frac{a_1}{a_2}y_m(a)y_n(a)\right] = 0 \]

If $a_2$ or $b_2$, or both, should be zero, this result can still be established by
substituting for $y(a)$ or $y(b)$, or both, instead of for their derivatives, since $a_1$ and $a_2$ cannot
both vanish nor can $b_1$ and $b_2$. Moreover, if $r(a) = 0$, then the first boundary
condition becomes irrelevant; that is, the integrated terms vanish at $x = a$
without the need of any condition on the solutions $y_m$ and $y_n$. Likewise, if $r(b) = 0$,
the second boundary condition is irrelevant. We have thus shown that under the
conditions of the theorem

\[ (\lambda_m - \lambda_n)\int_a^b p\,y_my_n\,dx = 0 \]

Since $\lambda_m$ and $\lambda_n$ were any two distinct values of $\lambda$, the difference $\lambda_m - \lambda_n$ cannot
vanish. Hence

\[ \int_a^b p\,y_my_n\,dx = 0 \]

and the theorem is established.

In each of the torsional vibration problems we considered in 
Sec. 8.4, the functions in terms of which we had to expand the 
initial conditions satisfied a differential equation and a set of 
boundary conditions included under Theorem 4. This, and not 
the coincidental fact that Fourier series were involved, explains 
why the final expansion could be carried out in each case. 



EXAMPLE 1 (continued) 

When we left Example 1 in order to develop the theory necessary to complete its solution, we
were faced with the necessity of expanding the initial temperature $u(x,0) = f(x)$ in a series of
the form (4),

\[ f(x) = \sum_{n=1}^{\infty}B_n\sin\lambda_nx \]

where the functions $\{\sin\lambda_nx\}$ were the solutions of the differential equation

\[ X'' + \lambda^2X = 0 \]

which satisfied the conditions

\[ X(0) = 0 \]
\[ hX(l) + X'(l) = 0 \]

But this equation and the accompanying boundary conditions are in all respects a special case
covered by Theorem 4. In fact, with $\lambda^2$ written in place of $\lambda$, we have

\[ r(x) = 1 \qquad q(x) = 0 \qquad p(x) = 1 \]
\[ a = 0 \qquad b = l \]
\[ a_1 = 1 \qquad a_2 = 0 \qquad b_1 = h \qquad b_2 = -1 \]

Hence, by the last theorem, the functions $\{\sin\lambda_nx\}$ form a set orthogonal with respect to the
weight function $p(x) = 1$ on the interval $(0,l)$.

To determine $B_n$ we now multiply Eq. (4) by $\sin\lambda_nx$ and formally integrate from 0 to $l$.
Because of the orthogonality of the functions $\sin\lambda_nx$, every integral on the right vanishes except
the one whose integrand contains $\sin^2\lambda_nx$. Therefore,

\[ B_n = \frac{\displaystyle\int_0^l f(x)\sin\lambda_nx\,dx}{\displaystyle\int_0^l \sin^2\lambda_nx\,dx} \]

or, evaluating the integral in the denominator and recalling that $z_n = \lambda_nl$ satisfies the equation
$\sin z_n = -\alpha z_n\cos z_n$,

\[ B_n = \frac{2}{l(1 + \alpha\cos^2z_n)}\int_0^l f(x)\sin\lambda_nx\,dx \]

With $B_n$ determined, the formal solution is now complete.
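Both the orthogonality asserted by Theorem 4 and the closed form of the normalizing integral can be verified numerically for this example. The following sketch is an illustration only: the values `l = 1.0` and `h = 2.0`, the constant initial temperature `f(x) = 1`, and all function names are assumptions made for the demonstration. The roots $z_n$ are found by bisection on $\sin z + \alpha z\cos z$, and the integrals by a composite Simpson rule. Note that the lists are indexed from 0, so `z[0]` corresponds to $z_1$.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def radiation_roots(alpha, count, tol=1e-12):
    """First `count` positive roots of tan z = -alpha*z, assuming alpha > 0."""
    g = lambda z: math.sin(z) + alpha * z * math.cos(z)
    roots = []
    for n in range(1, count + 1):
        lo, hi = (n - 0.5) * math.pi, n * math.pi   # one sign change per bracket
        g_lo_positive = g(lo) > 0
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if (g(mid) > 0) == g_lo_positive:
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots

l, h = 1.0, 2.0                   # illustrative rod length and radiation constant
alpha = 1.0 / (h * l)
z = radiation_roots(alpha, 4)
lam = [z_n / l for z_n in z]

def inner(m, n):
    """Integral over (0, l) of sin(lam_m x) sin(lam_n x), weight p(x) = 1."""
    return simpson(lambda x: math.sin(lam[m] * x) * math.sin(lam[n] * x), 0.0, l)

def b_coefficient(f, n):
    """B_n = 2 / (l (1 + alpha cos^2 z_n)) * integral_0^l f(x) sin(lam_n x) dx."""
    integral = simpson(lambda x: f(x) * math.sin(lam[n] * x), 0.0, l)
    return 2.0 * integral / (l * (1.0 + alpha * math.cos(z[n]) ** 2))

off_diagonal = max(abs(inner(m, n)) for m in range(4) for n in range(4) if m != n)
norms = [inner(n, n) for n in range(4)]
b = [b_coefficient(lambda x: 1.0, n) for n in range(4)]
```

The quantity `off_diagonal` should be zero to within quadrature error, and each entry of `norms` should agree with $\tfrac{l}{2}(1 + \alpha\cos^2 z_n)$. For the constant initial temperature used here the integral can be done by hand, giving $B_n = 2(1 - \cos z_n)/[z_n(1 + \alpha\cos^2 z_n)]$.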

Problems involving second-order differential equations are 
not the only ones in which orthogonal functions arise. In par- 
ticular, we have the following important theorem covering fourth- 
order systems, of which the vibrating beam is a special case. 


THEOREM 5 

Given the differential equation

\[ \frac{d^2[r(x)y'']}{dx^2} + [q(x) + \lambda p(x)]\,y = 0 \]

where $r(x)$ and $p(x)$ are continuous on the closed interval $(a,b)$ and $q(x)$ is
continuous at least on the open interval $(a,b)$. If $\lambda_1, \lambda_2, \lambda_3, \ldots$ are the values of
the parameter $\lambda$ for which there exist solutions of this equation possessing
continuous third derivatives and satisfying the boundary conditions

\[ a_1y(a) - \alpha_1(ry'')'\big|_a = 0 \qquad a_2y'(a) - \alpha_2(ry'')\big|_a = 0 \]
\[ b_1y(b) - \beta_1(ry'')'\big|_b = 0 \qquad b_2y'(b) - \beta_2(ry'')\big|_b = 0 \]

where neither $a_i$ and $\alpha_i$ nor $b_i$ and $\beta_i$ are both zero, and if $y_1, y_2, y_3, \ldots$ are the
solutions corresponding to these values of $\lambda$, then the functions $\{y_n(x)\}$ form a
system orthogonal with respect to the weight function $p(x)$ over the interval $(a,b)$.


EXAMPLE 2 


A uniform cantilever of length $l$ begins to vibrate with initial displacement $y(x,0) = f(x)$ and
initial velocity $\left.\dfrac{\partial y}{\partial t}\right|_{x,0} = g(x)$. Find its displacement at any point at any subsequent time.

For definiteness let us assume that the built-in end of the beam is at the origin. Then, since
the beam is of uniform cross section and bears no external load, we have to solve Eq. (11),
Sec. 8.2,

\[ a^2\frac{\partial^4y}{\partial x^4} = -\frac{\partial^2y}{\partial t^2} \]

subject to the boundary conditions

\[ y(0,t) = 0 \qquad\text{i.e., the displacement at the built-in end is zero} \]
\[ \left.\frac{\partial y}{\partial x}\right|_{0,t} = 0 \qquad\text{i.e., the slope at the built-in end is zero} \]
\[ \left.\frac{\partial^2y}{\partial x^2}\right|_{l,t} = 0 \qquad\text{i.e., the moment at the free end is zero} \]
\[ \left.\frac{\partial^3y}{\partial x^3}\right|_{l,t} = 0 \qquad\text{i.e., the shear at the free end is zero} \]


As usual, we begin by assuming a product solution y(x,t) = X(x)T(t), substituting it into
the given partial differential equation, getting $a^2 X^{\mathrm{iv}} T = -XT''$, and then separating variables,

$$a^2\frac{X^{\mathrm{iv}}}{X} = -\frac{T''}{T}$$

Since x and t are independent variables, these two fractions must have a common constant
value, say $\mu$. If $\mu \le 0$, the solution for T cannot be periodic, as we know it must be to represent
undamped vibrations. Hence we restrict $\mu$ to be positive,* and write $\mu = \lambda^2$. This leads to the
component differential equations

$$T'' = -\lambda^2 T \qquad\text{and}\qquad X^{\mathrm{iv}} = \frac{\lambda^2}{a^2}X$$

and the respective solutions

(7)  $$T = A\cos\lambda t + B\sin\lambda t$$

(8)  $$X = C\cos\sqrt{\frac{\lambda}{a}}\,x + D\sin\sqrt{\frac{\lambda}{a}}\,x + E\cosh\sqrt{\frac{\lambda}{a}}\,x + F\sinh\sqrt{\frac{\lambda}{a}}\,x$$


* In vibration problems where it is clear that only periodic solutions are 
possible, engineers often take their initial assumption to be 
$y(x,t) = X(x)(A\cos\lambda t + B\sin\lambda t)$

as, in effect, we did in Example 3, Sec. 5.5, in studying the undamped vibra- 
tions of an electrical network with only a finite number of degrees of freedom. 




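Imposing the first boundary condition, namely,

$$y(0,t) = X(0)T(t) = 0$$

we find $(C + E)T(t) = 0$.

Since we cannot permit T(t) to be identically zero without having the entire solution become
trivial, we conclude that

$$C + E = 0$$

Imposing the second boundary condition, namely,

$$\left.\frac{\partial y}{\partial x}\right|_{0,t} = X'(0)T(t) = 0$$

we find $\sqrt{\lambda/a}\,(D + F)\,T(t) = 0$. Hence, $D + F = 0$.

From the third and fourth boundary conditions

$$\left.\frac{\partial^2 y}{\partial x^2}\right|_{l,t} = X''(l)T(t) = 0 \qquad\text{and}\qquad \left.\frac{\partial^3 y}{\partial x^3}\right|_{l,t} = X'''(l)T(t) = 0$$

we obtain, respectively,

$$\frac{\lambda}{a}\left(-C\cos\sqrt{\frac{\lambda}{a}}\,l - D\sin\sqrt{\frac{\lambda}{a}}\,l + E\cosh\sqrt{\frac{\lambda}{a}}\,l + F\sinh\sqrt{\frac{\lambda}{a}}\,l\right)T(t) = 0$$

$$\left(\frac{\lambda}{a}\right)^{3/2}\left(C\sin\sqrt{\frac{\lambda}{a}}\,l - D\cos\sqrt{\frac{\lambda}{a}}\,l + E\sinh\sqrt{\frac{\lambda}{a}}\,l + F\cosh\sqrt{\frac{\lambda}{a}}\,l\right)T(t) = 0$$

Hence, for convenience, setting

(9)  $$z = \sqrt{\frac{\lambda}{a}}\,l$$

we must have

$$-C\cos z - D\sin z + E\cosh z + F\sinh z = 0$$

$$C\sin z - D\cos z + E\sinh z + F\cosh z = 0$$

If we eliminate C and D from these equations by using the conditions
$C + E = 0$ and $D + F = 0$
we obtain the system

(10)  $$E(\cosh z + \cos z) + F(\sinh z + \sin z) = 0$$

$$E(\sinh z - \sin z) + F(\cosh z + \cos z) = 0$$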

These equations will have a nontrivial solution for E and F if and only if the determinant of the 
coefficients is equal to zero. Hence we must have 

$$\begin{vmatrix} \cosh z + \cos z & \sinh z + \sin z \\ \sinh z - \sin z & \cosh z + \cos z \end{vmatrix} = 0$$

or, expanding and simplifying,

$$\cosh z\cos z = -1$$

The existence of infinitely many roots of this equation, i.e., $\cos z = -1/\cosh z$, can be
inferred from Fig. 8.13, where the graphs of

$$y = \cos z \qquad\text{and}\qquad y = -\frac{1}{\cosh z}$$

are plotted.

FIGURE 8.13  Plot showing the graphical solution of the equation $\cos z = -1/\cosh z$.


From these roots $z_1, z_2, z_3, \ldots$, we can find the relevant values of $\lambda$ at once from Eq. (9):

$$\lambda_n = \frac{a z_n^2}{l^2}$$
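
The roots of the frequency equation are easy to obtain numerically. The sketch below is one possible approach, not the text's own method: it brackets each root of $\cosh z\cos z + 1 = 0$ between consecutive multiples of $\pi$ (where the left side alternates in sign) and refines it by bisection. The function names and the sample values of a and l are assumptions made for illustration.

```python
import math

def cantilever_roots(count):
    """First `count` positive roots of cosh z cos z = -1, found by bisection."""
    h = lambda z: math.cosh(z) * math.cos(z) + 1.0
    roots = []
    for n in range(1, count + 1):
        lo, hi = (n - 1) * math.pi, n * math.pi  # h changes sign on each such interval
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if h(lo) * h(mid) <= 0.0:
                hi = mid
            else:
                lo = mid
        roots.append(0.5 * (lo + hi))
    return roots

def natural_frequencies(a, l, count):
    """lambda_n = a z_n^2 / l^2, obtained from z = sqrt(lambda / a) * l."""
    return [a * z * z / (l * l) for z in cantilever_roots(count)]

roots = cantilever_roots(3)
lams = natural_frequencies(a=1.0, l=1.0, count=3)  # a and l chosen arbitrarily
```

For large n the roots approach $(n - \tfrac12)\pi$, since $1/\cosh z$ becomes negligible and the equation reduces to $\cos z \approx 0$.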



When z has any one of the values $z_1, z_2, z_3, \ldots$, the equations (10) become dependent, and
we can write either

$$\frac{E}{F} = -\frac{\sinh z + \sin z}{\cosh z + \cos z} \qquad\text{or}\qquad \frac{E}{F} = -\frac{\cosh z + \cos z}{\sinh z - \sin z}$$

as we choose. Using the former, we have

$$E_n = -C_n = -(\sinh z_n + \sin z_n)K_n$$
$$F_n = -D_n = (\cosh z_n + \cos z_n)K_n$$

where $K_n$ is an arbitrary constant. Therefore, substituting into Eq. (8),

$$X_n(x) = C_n\cos\frac{z_n x}{l} + D_n\sin\frac{z_n x}{l} + E_n\cosh\frac{z_n x}{l} + F_n\sinh\frac{z_n x}{l}$$

$$\qquad = K_n(\sinh z_n + \sin z_n)\left(\cos\frac{z_n x}{l} - \cosh\frac{z_n x}{l}\right) - K_n(\cosh z_n + \cos z_n)\left(\sin\frac{z_n x}{l} - \sinh\frac{z_n x}{l}\right)$$

Hence, absorbing $K_n$ in the coefficients $A_n$ and $B_n$ in the expression (7) for $T_n$ and redefining $X_n$
to be the completely determined function

(11)  $$X_n(x) = (\sinh z_n + \sin z_n)\left(\cos\frac{z_n x}{l} - \cosh\frac{z_n x}{l}\right) - (\cosh z_n + \cos z_n)\left(\sin\frac{z_n x}{l} - \sinh\frac{z_n x}{l}\right)$$

we have, as the formal solution of the partial differential equation which meets the four given
boundary conditions,

$$y(x,t) = \sum_{n=1}^{\infty} X_n(x)T_n(t) = \sum_{n=1}^{\infty} X_n(x)(A_n\cos\lambda_n t + B_n\sin\lambda_n t)$$

To satisfy the initial displacement condition, we must have 

(12)  $$y(x,0) = f(x) = \sum_{n=1}^{\infty} A_n X_n(x)$$

and to satisfy the initial velocity condition, we must have

(13)  $$\left.\frac{\partial y}{\partial t}\right|_{x,0} = g(x) = \sum_{n=1}^{\infty} \lambda_n B_n X_n(x)$$


Thus, again, to satisfy the initial conditions we must be able to expand an arbitrary function
in an infinite series of known functions, in this case the functions of the set $\{X_n(x)\}$ defined by
Eq. (11). These bear little or no resemblance to the terms of a Fourier series, but the required
expansions can easily be carried out using the orthogonality of the $X_n$'s, which is guaranteed by
Theorem 5 (and of course their completeness, which as usual we must assume). In fact, with
$\lambda^2/a^2$ written in place of $\lambda$, our problem is just the special case of Theorem 5 for which

$$r(x) = 1 \qquad q(x) = 0 \qquad p(x) = 1$$
$$a = 0 \qquad b = l$$
$$a_1 = 1 \qquad \alpha_1 = 0 \qquad b_1 = 0 \qquad \beta_1 = -1$$
$$a_2 = 1 \qquad \alpha_2 = 0 \qquad b_2 = 0 \qquad \beta_2 = -1$$

Hence, the functions of the set $\{X_n(x)\}$ are orthogonal with respect to the weight function
p(x) = 1 over the interval (0,l).

With the orthogonality of the $X_n$'s now established, we can determine $A_n$ and $B_n$ immediately
by multiplying Eqs. (12) and (13) by $X_n(x)$ and integrating from 0 to l. The results are

$$A_n = \frac{\int_0^l f(x)X_n(x)\,dx}{\int_0^l X_n^2(x)\,dx} \qquad\text{and}\qquad B_n = \frac{\int_0^l g(x)X_n(x)\,dx}{\lambda_n\int_0^l X_n^2(x)\,dx}$$
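
The orthogonality that Theorem 5 guarantees for the cantilever modes can be confirmed numerically, and the same quadrature yields the expansion coefficients. The following is a minimal sketch under assumed values (l = 1 and a purely illustrative initial displacement $f(x) = x^2$); the helper names are hypothetical. It builds $X_1$ and $X_2$ from Eq. (11), integrates their product with Simpson's rule, and evaluates $A_1$ from the quotient of integrals above.

```python
import math

def bisect(func, lo, hi, iters=200):
    """Bisection for a root of func bracketed by [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(lo) * func(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = func(a) + func(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * func(a + k * h)
    return s * h / 3.0

def mode(z, l):
    """X_n(x) of Eq. (11) for a root z of cosh z cos z = -1."""
    s = math.sinh(z) + math.sin(z)
    c = math.cosh(z) + math.cos(z)
    def X(x):
        u = z * x / l
        return s * (math.cos(u) - math.cosh(u)) - c * (math.sin(u) - math.sinh(u))
    return X

l = 1.0                                             # assumed beam length
freq = lambda z: math.cosh(z) * math.cos(z) + 1.0   # frequency equation
z1 = bisect(freq, 0.0, math.pi)
z2 = bisect(freq, math.pi, 2.0 * math.pi)
X1, X2 = mode(z1, l), mode(z2, l)

inner12 = simpson(lambda x: X1(x) * X2(x), 0.0, l)  # should be ~0
norm1 = simpson(lambda x: X1(x) ** 2, 0.0, l)
norm2 = simpson(lambda x: X2(x) ** 2, 0.0, l)

f = lambda x: x * x                                 # hypothetical initial displacement
A1 = simpson(lambda x: f(x) * X1(x), 0.0, l) / norm1
```

Because the modes grow like $e^{z_n}$, higher modes are better evaluated in a rescaled form; the direct formula is adequate for the first few.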

We are now in a position to summarize the main features
of a simple boundary value problem. By assuming that solutions
for the dependent variable exist in the form of products of
functions of the respective independent variables, the original
differential equation is broken down into several ordinary differential
equations, each of which involves a parameter $\lambda$ which ranges
over a continuous infinity of values.

When the boundary conditions of the problem are imposed
upon the product solutions obtained by solving the component
ordinary differential equations, it is necessary, in order to avoid
solutions which are identically zero, that the parameter $\lambda$ satisfy
a certain equation. This equation is known as the characteristic
equation of the problem, and its roots, in general infinite in 
number, are known as the characteristic values or eigenvalues 
or Eigenwerte* of the problem. Only for them can solutions be 
found satisfying both the partial differential equation and the 
given boundary conditions. In a vibration problem, the char- 
acteristic values determine the natural frequencies of the system, 
and the characteristic equation is therefore usually called the 
frequency equation. The solutions which correspond to the 
respective characteristic values are known as the characteristic 
functions or eigenfunctions of the problem. In a vibration prob- 
lem, they are usually called the normal modes, since they define 
the relative amplitudes of the extreme positions between which 
the system oscillates when it is vibrating at a single natural 
frequency, i.e., in a “normal” manner. 

To satisfy the initial conditions of the problem it is necessary 
to be able to express an arbitrary function as an infinite series 


* German for characteristic values. 




of the characteristic functions of the problem. This can be done
in most cases of interest because under very general conditions 
the characteristic functions of a boundary value problem form 
an orthogonal set over the particular interval related to the 
problem. 

EXERCISES 

1 If r(a) = r(b), show that the conclusion of Theorem 4 follows equally well if the boundary
conditions are of the form

$$y(a) = y(b) \qquad\text{and}\qquad y'(a) = y'(b)$$

2 Verify by direct integration that the characteristic functions of Example 1 are orthogonal 
over the interval (0,l).

3 In Example 1, if $\alpha = 1$, compute the values of $z_1$, $z_2$, and $z_3$, and determine the first three
coefficients in the expansion of the initial condition u(x,0) = f(x) = x.

4 In Example 1, if the left end of the rod is perfectly insulated, determine the characteristic
equation, show that it has infinitely many roots, and prove that $z_{n+1} - z_n$ approaches $\pi$ as
n becomes infinite.

5 Find the temperature u(x,t) in a slender rod of length l whose curved surface and left end
are perfectly insulated and whose right end radiates freely into air of constant temperature
0° if the rod is initially at the temperature 100° throughout.

6 In Example 1, if both ends of the rod radiate freely into air of constant temperature 0°,
determine the characteristic equation, show that it has infinitely many roots, and prove
that $z_{n+1} - z_n$ approaches $\pi$ as n becomes infinite.

7 A slender rod of length l has its curved surface and left end perfectly insulated. Its right
end radiates freely into air of constant temperature 70°. Initially the temperature throughout
the rod is 100°. Find the temperature of the rod as a function of x and t. (Hint: Let
U = u - 70 be the dependent variable.)

8 Work Example 1 with the left end of the rod maintained at the constant temperature 100°. 

9 Prove Theorem 5. 

10 Prove that the general linear second-order differential equation

$$p_0(x)y'' + p_1(x)y' + p_2(x)y = \lambda y$$

can be reduced to an equation of the Sturm-Liouville form (see Theorem 4) by multiplying
it by the factor

$$\frac{1}{p_0(x)}\,e^{\int^x [p_1(x)/p_0(x)]\,dx}$$

11 Find the frequency equation and the normal modes for the transverse vibration of a uniform 
beam whose ends are 

a Hinged-hinged (Hint: A hinged end is one where a beam, though constrained so it can- 
not deflect, is still free to turn, i.e., an end where both the displacement and the moment 
are zero at all times. A hinged end is often referred to as a simply supported end.) 
b Fixed-fixed c Free-free 

d Fixed-hinged e Free-hinged 

12 Find the frequency equation for the transverse vibrations of a uniform cantilever bearing 
a concentrated mass at the free end. (Hint: At the free end one boundary condition is that 
the shear, instead of being zero, is equal to the inertia force of the attached mass.) 

13 Find the frequency equation for the transverse vibrations of a uniform hinged-hinged beam 
bearing a concentrated mass at its mid-point. 

14 Find the frequency equation for a uniform torsional cantilever if a disk of polar moment of 


328 


PARTIAL DIFFERENTIAL EQUATIONS 


CHAP. 8 


| 

1 

I 

1 


inertia /„ is attached to the free end of the shaft. (Hint: At the free end of the shaft the 
boundary condition is that the torque, instead of being zero, is equal to the inertia torque 
of the disk.) 

15 Show that the normal modes in Exercise 14 are not orthogonal.

16 Find the frequency equation for the torsional vibrations of a shaft of length 2l which is
clamped at x = 0 and free at x = 2l if the radius of the portion of the shaft between x = 0
and x = l is $r_1$ and the radius of the portion of the shaft between x = l and x = 2l is $r_2$.
[Hint: Solve the problem separately for the interval (0,l) and the interval (l,2l) and then,
in addition to the two end conditions, impose the conditions that at x = l both the angle of
twist and the transmitted torque are continuous.]

17 Show that the system $\{\cos nx\}$, n = 0, 1, 2, 3, . . . , is orthogonal but not complete over
the interval $(-\pi,\pi)$.

18 Show that for an orthonormal system $\{\phi_n(x)\}$, whether closed or not, we have Bessel's
inequality

$$\sum_{n=1}^{\infty} a_n^2 \le \int_a^b [f(x)]^2\,dx$$

where the a's are the coefficients in the generalized Fourier expansion of f(x) in terms of the
$\phi$'s and (a,b) is the interval of orthogonality. Using this result, show that

$$\lim_{n\to\infty}\int_a^b \phi_n(x)f(x)\,dx = 0$$

19 If $\{\phi_n(x)\}$ is an orthonormal set over the interval (a,b), show that the values of the c's
which make

$$\int_a^b [f(x) - c_1\phi_1(x) - c_2\phi_2(x) - \cdots - c_n\phi_n(x)]^2\,dx$$

a minimum are the corresponding coefficients in the generalized Fourier expansion of f(x)
in terms of the $\phi$'s.

20 What is the minimum value of the integral in Exercise 19? 


Further applications 

Many problems in partial differential equations involve features
not found in the simple examples we have used to elaborate the
standard, elementary theory. We cannot here investigate in
detail the variations and extensions of this theory, but, as
illustrations, we shall present several additional examples exhibiting
techniques of practical interest. In the first example, we shall see
how the analysis of the forced vibrations of a continuous system
leads to a nonhomogeneous rather than a homogeneous partial
differential equation. In the second, we shall see how Fourier
integrals, rather than Fourier series, enter into problems where
the boundary conditions fail to provide a characteristic equation
and $\lambda$ remains a continuous parameter. In the third, we shall see
that, though a partial differential equation may be separable, it
may be impossible to make its product solutions fit the boundary
conditions and so other methods must be used to solve it. In the


SEC. 8.6 


FURTHER APPLICATIONS 


329 


fourth, we shall see how a partial differential equation involving 
three rather than two independent variables leads to a double 
series of characteristic functions and two separate expansion 
problems. The important matter of the application of Laplace 
transform methods to the solution of partial differential equations 
we shall consider in the next section. 


EXAMPLE 1 

A uniform string of length l is acted upon by a distributed periodic force

$$f(x,t) = \frac{w}{g}\,\phi(x)\sin\omega t$$

If the string is initially at rest in its equilibrium position, determine its subsequent motion given
that frictional effects are negligible.

From Eq. (2), Sec. 8.2, it is clear that the deflection of the string satisfies the partial
differential equation

(1)  $$\frac{\partial^2 y}{\partial t^2} = a^2\frac{\partial^2 y}{\partial x^2} + \phi(x)\sin\omega t$$

As in the case of a system with a single degree of freedom, the motion of the string consists of
two parts, one described by the solution of the homogeneous equation

(2)  $$\frac{\partial^2 y}{\partial t^2} = a^2\frac{\partial^2 y}{\partial x^2}$$

and the other described by a particular solution corresponding to the nonhomogeneous term
$\phi(x)\sin\omega t$.

To find the solution of the homogeneous equation (2), that is, to determine the free motion
of the string, we assume a product solution

$$y_H(x,t) = X(x)T(t)$$

and proceed exactly as we did in solving the wave equation for the torsional vibrations of a
fixed-fixed shaft of uniform cross section in Sec. 8.4. The result is

(3)  $$y_H(x,t) = \sum_{n=1}^{\infty}\sin\frac{n\pi x}{l}\left(A_n\cos\frac{n\pi at}{l} + B_n\sin\frac{n\pi at}{l}\right)$$

To find a particular solution of the nonhomogeneous equation (1), we observe that from 
physical considerations, the motion produced by the applied force must be periodic with the 
same period as the force. Moreover, since the system is assumed to be frictionless, it is clear that 
the motion of the string must be in phase with the force. Hence it is reasonable to assume a solu- 
tion of the form 

$$Y(x,t) = \Phi(x)\sin\omega t$$

We can now proceed in either of two ways. In the first place, we can substitute Y(x,t) into the
nonhomogeneous equation (1), divide out the common factor $\sin\omega t$, solve the resulting
nonhomogeneous ordinary differential equation, namely,

$$-\omega^2\Phi = a^2\Phi'' + \phi(x)$$

and impose upon it the boundary conditions that

$$Y(0,t) = Y(l,t) = 0$$

i.e., the conditions that

$$\Phi(0) = \Phi(l) = 0$$




When $\Phi(x)$ has been determined so that these conditions are fulfilled, we can then construct the
complete solution

(4)  $$y(x,t) = y_H(x,t) + Y(x,t) = \sum_{n=1}^{\infty}\sin\frac{n\pi x}{l}\left(A_n\cos\frac{n\pi at}{l} + B_n\sin\frac{n\pi at}{l}\right) + \Phi(x)\sin\omega t$$

The initial conditions can now be imposed, giving

(5)  $$y(x,0) = 0 = \sum_{n=1}^{\infty} A_n\sin\frac{n\pi x}{l}$$

and

(6)  $$\dot y(x,0) = 0 = \sum_{n=1}^{\infty}\frac{n\pi a}{l}B_n\sin\frac{n\pi x}{l} + \omega\Phi(x)$$

From (5) we conclude that the A's are the coefficients in the half-range sine expansion of 0;
hence, $A_n = 0$ for all values of n. From (6) we conclude, similarly, that the terms $\dfrac{n\pi a}{l}B_n$ are
the coefficients in the half-range sine expansion of $-\omega\Phi(x)$; hence,

$$B_n = -\frac{2\omega}{n\pi a}\int_0^l \Phi(x)\sin\frac{n\pi x}{l}\,dx$$

provided that $\omega$ is not equal to one of the natural frequencies $\{\omega_n\} = \{n\pi a/l\}$ of the system,
i.e., provided that the system is not being "driven" at resonance.

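On the other hand, before substituting $Y(x,t) = \Phi(x)\sin\omega t$ into the nonhomogeneous
equation (1), we can expand $\phi(x)$ into a half-range sine series, getting, say,

$$\phi(x) = \sum_{n=1}^{\infty} C_n\sin\frac{n\pi x}{l}$$

Then, assuming for $\Phi(x)$ a half-range sine expansion with undetermined coefficients, say

$$\Phi(x) = \sum_{n=1}^{\infty} D_n\sin\frac{n\pi x}{l}$$

we have, on substituting

$$Y(x,t) = \left(\sum_{n=1}^{\infty} D_n\sin\frac{n\pi x}{l}\right)\sin\omega t$$

into Eq. (1) and then dividing out $\sin\omega t$,

$$-\omega^2\sum_{n=1}^{\infty} D_n\sin\frac{n\pi x}{l} = -a^2\sum_{n=1}^{\infty} D_n\left(\frac{n\pi}{l}\right)^2\sin\frac{n\pi x}{l} + \sum_{n=1}^{\infty} C_n\sin\frac{n\pi x}{l}$$

Making this relation an identity by equating to zero the coefficient of $\sin(n\pi x/l)$,
we find

$$D_n = \frac{C_n}{(n\pi a/l)^2 - \omega^2} = \frac{C_n}{\omega_n^2 - \omega^2}$$

Hence,

$$\Phi(x) = \sum_{n=1}^{\infty}\frac{C_n}{\omega_n^2 - \omega^2}\sin\frac{n\pi x}{l}$$

$$Y(x,t) = \sin\omega t\sum_{n=1}^{\infty}\frac{C_n}{\omega_n^2 - \omega^2}\sin\frac{n\pi x}{l}$$

and the complete formal solution of the nonhomogeneous equation becomes

$$y(x,t) = \sum_{n=1}^{\infty}\sin\frac{n\pi x}{l}\left(A_n\cos\frac{n\pi at}{l} + B_n\sin\frac{n\pi at}{l}\right) + \left(\sum_{n=1}^{\infty}\frac{C_n}{\omega_n^2 - \omega^2}\sin\frac{n\pi x}{l}\right)\sin\omega t$$

$$\qquad = \sum_{n=1}^{\infty}\sin\frac{n\pi x}{l}\left(A_n\cos\frac{n\pi at}{l} + B_n\sin\frac{n\pi at}{l} + \frac{C_n}{\omega_n^2 - \omega^2}\sin\omega t\right)$$

To satisfy the initial conditions, we must have

$$y(x,0) = 0 = \sum_{n=1}^{\infty} A_n\sin\frac{n\pi x}{l}$$

whence, $A_n = 0$; and

$$\dot y(x,0) = 0 = \sum_{n=1}^{\infty}\sin\frac{n\pi x}{l}\left(\frac{n\pi a}{l}B_n + \frac{\omega C_n}{\omega_n^2 - \omega^2}\right)$$

whence,

$$B_n = \frac{\omega l\,C_n}{n\pi a(\omega^2 - \omega_n^2)}$$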


From the expression for $B_n$ it is clear that the frequency $\omega$ of the impressed force must not
coincide with any natural frequency $\omega_n$ of the string unless the corresponding coefficient $C_n$ is
equal to zero, that is, unless the term

$$\sin\frac{n\pi x}{l} = \sin\frac{\omega_n x}{a}$$

is missing from the half-range sine expansion of $\phi(x)$. If $\omega = \omega_n$ and $C_n \ne 0$ for some particular
value of n, then the string is effectively being driven at a condition of resonance, and
displacements of arbitrarily large amplitudes will be built up. (See Exercise 1.)
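
To see the series solution at work, the following sketch evaluates it for an illustrative forcing profile $\phi(x) = x(l - x)$, whose half-range sine coefficients are $C_n = 4l^2[1 - (-1)^n]/(n\pi)^3$. The parameter values (a = 1, l = 1, $\omega$ = 2, 50 terms) and the function names are assumptions chosen for the demonstration; $\omega$ is deliberately kept away from every $\omega_n = n\pi a/l$. A finite-difference residual checks that the truncated series approximately satisfies Eq. (1).

```python
import math

A_SPEED, LENGTH, OMEGA, TERMS = 1.0, 1.0, 2.0, 50  # assumed illustrative values

def forced_string(x, t):
    """Truncated series for y(x,t) with A_n = 0 and B_n, D_n from the formulas above."""
    y = 0.0
    for n in range(1, TERMS + 1):
        k = n * math.pi / LENGTH
        w_n = k * A_SPEED                                   # natural frequency
        C_n = 4.0 * LENGTH ** 2 * (1 - (-1) ** n) / (n * math.pi) ** 3
        D_n = C_n / (w_n ** 2 - OMEGA ** 2)                 # particular-solution coefficient
        B_n = OMEGA * LENGTH * C_n / (n * math.pi * A_SPEED * (OMEGA ** 2 - w_n ** 2))
        y += math.sin(k * x) * (B_n * math.sin(w_n * t) + D_n * math.sin(OMEGA * t))
    return y

def pde_residual(x, t, h=1e-3):
    """y_tt - a^2 y_xx - phi(x) sin(omega t), via central differences."""
    y = forced_string
    y_tt = (y(x, t + h) - 2.0 * y(x, t) + y(x, t - h)) / h ** 2
    y_xx = (y(x + h, t) - 2.0 * y(x, t) + y(x - h, t)) / h ** 2
    return y_tt - A_SPEED ** 2 * y_xx - x * (LENGTH - x) * math.sin(OMEGA * t)

res = pde_residual(0.3, 0.7)
```

As $\omega$ is moved toward some $\omega_n$ with $C_n \ne 0$, the factor $1/(\omega^2 - \omega_n^2)$ grows without bound, which is the resonance behavior described above.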


EXAMPLE 2 

A slender rod whose curved surface is perfectly insulated stretches from x = 0 to x = $\infty$. Find
the temperature in the rod as a function of x and t if the left end of the rod is maintained at the
constant temperature 0° and if initially the temperature along the rod is given by u(x,0) = f(x).
Exactly as in Example 1, Sec. 8.5, we find that the function

$$u = Be^{-\lambda^2 t/a^2}\sin\lambda x$$

satisfies the heat equation

$$\frac{\partial^2 u}{\partial x^2} = a^2\frac{\partial u}{\partial t}$$

and the boundary condition at the left end of the rod,

$$u(0,t) = 0$$

Lacking a second boundary condition, however, we have no further restriction on $\lambda$. Therefore,
instead of having an infinite set of discrete characteristic values $\lambda_n$, with corresponding solutions

$$u_n(x,t) = B_n e^{-\lambda_n^2 t/a^2}\sin\lambda_n x$$

we have a continuous family of solutions

$$u_\lambda(x,t) = B(\lambda)e^{-\lambda^2 t/a^2}\sin\lambda x$$

where the arbitrary constant B is now associated not with n, but with the continuous parameter
$\lambda$, which can assume any real value.

We cannot speak of an infinite series of particular solutions in this case. Instead of adding 




the product solutions for each value of n we therefore try integrating them over all values of $\lambda$,

(7)  $$u(x,t) = \int_0^{\infty} B(\lambda)e^{-\lambda^2 t/a^2}\sin\lambda x\,d\lambda$$

By direct substitution it is easily verified that this integral is a solution of the heat equation.

It is now necessary to impose the initial condition u(x,0) = f(x) on the solution u(x,t).
Setting t = 0 in Eq. (7), it is clear that this requires that

$$f(x) = \int_0^{\infty} B(\lambda)\sin\lambda x\,d\lambda$$

But this is just an instance of the Fourier integral we considered in Sec. 6.7. There, in discussing
what we called Fourier sine integrals, we saw [Eq. (15a), Sec. 6.7] that if

$$f(x) = \int_0^{\infty} B(\lambda)\sin\lambda x\,d\lambda$$

then the coefficient function $B(\lambda)$ is given by

$$B(\lambda) = \frac{2}{\pi}\int_0^{\infty} f(x)\sin\lambda x\,dx$$

Introducing the dummy variable s for x in the integral defining $B(\lambda)$, we can, therefore, write
Eq. (7) in the form

$$u(x,t) = \int_0^{\infty} e^{-\lambda^2 t/a^2}\left[\frac{2}{\pi}\int_0^{\infty} f(s)\sin\lambda s\,ds\right]\sin\lambda x\,d\lambda$$

$$\qquad = \frac{2}{\pi}\int_0^{\infty}\!\!\int_0^{\infty} e^{-\lambda^2 t/a^2}f(s)\sin\lambda s\,\sin\lambda x\,ds\,d\lambda$$

which is the required solution. 
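
The double integral can be evaluated numerically once f is specified. The sketch below assumes the illustrative initial temperature $f(s) = se^{-s^2}$ (not taken from the text), chosen because the integrals can then be done in closed form, giving $u(x,t) = x\,e^{-x^2/4c}/(8c^{3/2})$ with $c = \tfrac14 + t/a^2$ for comparison. Both infinite ranges are truncated where the integrands are negligible; the truncation limits and function names are assumptions.

```python
import math

def simpson(func, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = func(a) + func(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * func(a + k * h)
    return s * h / 3.0

def f(s):
    """Hypothetical initial temperature, decaying fast enough to truncate at s = 8."""
    return s * math.exp(-s * s)

def B(lam, s_max=8.0, n=400):
    """B(lambda) = (2/pi) * integral_0^inf f(s) sin(lambda s) ds, truncated at s_max."""
    return (2.0 / math.pi) * simpson(lambda s: f(s) * math.sin(lam * s), 0.0, s_max, n)

def u(x, t, a=1.0, lam_max=12.0, n=600):
    """Eq. (7): integral over lambda of B(lambda) exp(-lambda^2 t / a^2) sin(lambda x)."""
    integrand = lambda lam: B(lam) * math.exp(-lam * lam * t / (a * a)) * math.sin(lam * x)
    return simpson(integrand, 0.0, lam_max, n)

def u_exact(x, t, a=1.0):
    """Closed form for this particular f, used only as a check."""
    c = 0.25 + t / (a * a)
    return x * math.exp(-x * x / (4.0 * c)) / (8.0 * c ** 1.5)

numeric = u(1.0, 0.5)
exact = u_exact(1.0, 0.5)
```

For an f without a known sine transform, only `f` needs to be replaced, provided the truncation limits are adjusted to its rate of decay.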

EXAMPLE 3 

Find the steady-state potential at any point of an infinitely long transmission line if a signal
voltage $E_0\cos\omega t$ is applied at the sending end x = 0.

Here we have to solve the telephone equation

(8)  $$\frac{\partial^2 e}{\partial x^2} = LC\frac{\partial^2 e}{\partial t^2} + (RC + GL)\frac{\partial e}{\partial t} + RGe$$

subject to the boundary conditions

(9)  $$e(0,t) = E_0\cos\omega t \qquad e(x,t)\ \text{bounded as}\ x \to \infty$$

If we assume a product solution e(x,t) = X(x)T(t) and separate variables, we obtain

$$\frac{X''}{X} = \frac{LCT'' + (RC + GL)T' + RGT}{T} = \mu$$

Thus the factor T satisfies the equation

$$LCT'' + (RC + GL)T' + (RG - \mu)T = 0$$

and hence T must be of one or the other of the forms

$$e^{pt}(A\cos qt + B\sin qt)$$
$$e^{pt}(At + B)$$
$$Ae^{p_1 t} + Be^{p_2 t}$$

Under no circumstances can the last two expressions represent periodic behavior. Moreover,
the first expression can represent periodic behavior only if p = 0, which is impossible, since
$p = -(RC + GL)/2LC \ne 0$. Hence, no product solution of Eq. (8) is capable of describing
what we know the steady-state behavior of the line must be.




If we reconsider the problem, in an attempt to find an alternative method of solution, it 
seems reasonable to expect that, under the given conditions, the voltage along the line will vary 
harmonically with time while exhibiting attenuation and phase shift depending on the distance 
from the sending end. Hence we are led to try an expression of the form 

(10)  $$e(x,t) = E_0 e^{-ax}\cos(\omega t + bx)$$

If a > 0, this obviously satisfies each of the boundary conditions (9), and perhaps the constants
a and b can be determined so that it will satisfy the differential equation also.

If we substitute the tentative solution (10) into the telephone equation (8), divide out
$E_0 e^{-ax}$, and collect terms, without difficulty we obtain

$$[a^2 - b^2 + LC\omega^2 - RG]\cos(\omega t + bx) + [2ab + \omega(RC + GL)]\sin(\omega t + bx) = 0$$

This will be an identity if and only if

(11)  $$a^2 - b^2 = RG - LC\omega^2$$

(12)  $$2ab = -(RC + GL)\omega$$

Now by adding the square of Eq. (12) to the square of Eq. (11), we obtain

$$(a^2 + b^2)^2 = (RG - LC\omega^2)^2 + (RC + GL)^2\omega^2$$

or

(13)  $$a^2 + b^2 = \sqrt{(RG - LC\omega^2)^2 + (RC + GL)^2\omega^2}$$

Finally, by solving (11) and (13) simultaneously, we find

$$a^2 = \tfrac12\left[\sqrt{(RG - LC\omega^2)^2 + (RC + GL)^2\omega^2} + (RG - LC\omega^2)\right]$$
$$b^2 = \tfrac12\left[\sqrt{(RG - LC\omega^2)^2 + (RC + GL)^2\omega^2} - (RG - LC\omega^2)\right]$$

From the form of these equations it is clear that $a^2$ and $b^2$ are positive. Hence a and b are real,
and, with their values now determined, Eq. (10) becomes the required solution. In a similar
manner, of course, the steady-state response to a signal voltage of the form $E_0\sin\omega t$ can be
found.
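
The attenuation constant a and phase constant b are straightforward to compute. The sketch below is a minimal illustration with arbitrarily chosen line parameters (the numerical values of R, L, G, C, $\omega$ and the function names are assumptions). It takes a > 0 as required for boundedness and picks the sign of b so that Eq. (12) holds, then checks Eq. (8) by central differences.

```python
import math

def attenuation_phase(R, L, G, C, omega):
    """Return (a, b) satisfying Eqs. (11) and (12) with a > 0."""
    p = R * G - L * C * omega ** 2       # right side of Eq. (11)
    q = (R * C + G * L) * omega          # magnitude of right side of Eq. (12)
    root = math.hypot(p, q)              # a^2 + b^2, Eq. (13)
    a = math.sqrt(0.5 * (root + p))
    b = -math.sqrt(0.5 * (root - p))     # negative, so that 2ab = -q when q > 0
    return a, b

def voltage(x, t, E0, a, b, omega):
    """Steady-state voltage of Eq. (10)."""
    return E0 * math.exp(-a * x) * math.cos(omega * t + b * x)

# Hypothetical line parameters, for illustration only.
R, L, G, C, omega, E0 = 2.0, 0.5, 0.1, 0.3, 3.0, 1.0
a, b = attenuation_phase(R, L, G, C, omega)

def telephone_residual(x, t, h=1e-3):
    """e_xx - LC e_tt - (RC + GL) e_t - RG e, via central differences."""
    e = lambda xx, tt: voltage(xx, tt, E0, a, b, omega)
    e_xx = (e(x + h, t) - 2.0 * e(x, t) + e(x - h, t)) / h ** 2
    e_tt = (e(x, t + h) - 2.0 * e(x, t) + e(x, t - h)) / h ** 2
    e_t = (e(x, t + h) - e(x, t - h)) / (2.0 * h)
    return e_xx - L * C * e_tt - (R * C + G * L) * e_t - R * G * e(x, t)

res = telephone_residual(1.0, 0.4)
```

With b negative, the phase $\omega t + bx$ describes a wave traveling away from the sending end, which is the physically expected direction.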

By means of these results it is now possible to find the steady-state voltage corresponding
to any periodic signal voltage; for if e(0,t) = f(t) is a periodic function with period 2p, then it can
be expanded in a Fourier series,

$$f(t) = \frac{a_0}{2} + a_1\cos\frac{\pi t}{p} + a_2\cos\frac{2\pi t}{p} + \cdots + b_1\sin\frac{\pi t}{p} + b_2\sin\frac{2\pi t}{p} + \cdots$$

and the steady-state solution for each of these terms can be found. Then, since the telephone
equation is linear, the sum of the steady-state responses to each of these terms will be the
steady-state response of the line to the entire signal f(t). Moreover, if the input signal is not periodic,
the steady-state solution can still be found by a similar analysis using the Fourier integral
rather than a Fourier series.

EXAMPLE 4 

A very thin sheet of metal coincides with the square in the xy-plane whose vertices are the
points (0,0), (1,0), (1,1), and (0,1). The upper and lower faces of the sheet are perfectly insulated,
so that heat flow in it is purely two-dimensional. Initially, the temperature distribution in the
sheet is u(x,y,0) = f(x,y). If there are no sources of heat in the sheet, find the temperature at
any point at any subsequent time, given that the edges parallel to the x-axis are perfectly
insulated and that the edges parallel to the y-axis are maintained at the constant temperature 0°.
Here we have to solve the two-dimensional form of the heat equation [Eq. (14), Sec. 8.2],

(14)  $$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = a^2\frac{\partial u}{\partial t}$$




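subject to the boundary conditions

(15)  $$u(0,y,t) = 0 \qquad u(1,y,t) = 0$$

(16)  $$\left.\frac{\partial u}{\partial y}\right|_{x,0,t} = 0 \qquad \left.\frac{\partial u}{\partial y}\right|_{x,1,t} = 0$$

and the initial condition

(17)  $$u(x,y,0) = f(x,y)$$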

Because we now have three independent variables, we begin with a product solution of the
form

(18)  $$u(x,y,t) = X(x)Y(y)T(t)$$

Then substituting this into Eq. (14) and attempting to separate variables, we get

(19)  $$\frac{X''}{X} = a^2\frac{T'}{T} - \frac{Y''}{Y}$$


Although y and t enter together on the right-hand side of (19), they are both independent
of x, and so each side of the equation must be a constant, say $\mu$. Thus the factor X satisfies
the equation

$$X'' = \mu X$$

If $\mu > 0$, say $\mu = \lambda^2$, we have

$$X = A\cosh\lambda x + B\sinh\lambda x$$

But from the first of the boundary conditions (15), namely,

$$X(0)Y(y)T(t) = AY(y)T(t) = 0$$

it follows that A = 0. Likewise, from the second of the conditions (15), namely,

$$u(1,y,t) = X(1)Y(y)T(t) = (B\sinh\lambda)Y(y)T(t) = 0$$

it follows that B = 0. Hence, when $\mu > 0$, the factor X(x) vanishes identically, and only a
trivial solution is possible.

If $\mu = 0$, we have

$$X = Ax + B$$

and again the boundary conditions (15) can be satisfied only if A = B = 0.

Finally, if $\mu < 0$, say $\mu = -\lambda^2$, we have

$$X = A\cos\lambda x + B\sin\lambda x$$

From the first of the boundary conditions (15) we conclude that A = 0. The second condition
requires that $(B\sin\lambda)Y(y)T(t) = 0$ and, since we cannot permit B to be zero, we must have

$$\sin\lambda = 0 \qquad\text{and}\qquad \lambda = m\pi \qquad m = 1, 2, 3, \ldots$$

Therefore,

(20)  $$X_m(x) = \sin m\pi x \qquad m = 1, 2, 3, \ldots$$

Continuing with the other equation arising from (19), we now have

$$a^2\frac{T'}{T} - \frac{Y''}{Y} = -m^2\pi^2$$

or

(21)  $$\frac{Y''}{Y} = a^2\frac{T'}{T} + m^2\pi^2$$

Since y and t are also independent, each member of the last equation must be a constant, say $\eta$.
Thus, the factor Y satisfies the equation

$$Y'' = \eta Y$$




If $\eta > 0$, say $\eta = \nu^2$, we have

$$Y = C\cosh\nu y + D\sinh\nu y$$

and

$$Y' = \nu C\sinh\nu y + \nu D\cosh\nu y$$

But, from the first of the boundary conditions (16), namely,

$$\left.\frac{\partial u}{\partial y}\right|_{x,0,t} = X(x)Y'(0)T(t) = X(x)(\nu D)T(t) = 0$$

it follows that D = 0. Likewise, from the second of the conditions (16), namely,

$$\left.\frac{\partial u}{\partial y}\right|_{x,1,t} = X(x)Y'(1)T(t) = X(x)(\nu C\sinh\nu)T(t) = 0$$

it follows that C = 0. Hence, when $\eta > 0$, the factor Y(y) vanishes identically, and only a
trivial solution is possible.

If $\eta = 0$, we have

$$Y = Cy + D$$

and this time the boundary conditions (16) require that C = 0 but do not restrict D. Hence,
Y = D is a possible solution for the factor Y.

Finally, if $\eta < 0$, say $\eta = -\nu^2$, we have

$$Y = C\cos\nu y + D\sin\nu y$$

and

$$Y' = -\nu C\sin\nu y + \nu D\cos\nu y$$

From the first of the conditions (16) we conclude again that D = 0. The second of the conditions
(16) requires that

$$X(x)(-\nu C\sin\nu)T(t) = 0$$

and, since we cannot permit C = 0, we must have

$$\sin\nu = 0 \qquad\text{and}\qquad \nu = n\pi \qquad n = 1, 2, 3, \ldots$$

Therefore,  $Y_n(y) = \cos n\pi y \qquad n = 1, 2, 3, \ldots$

or, including the solution Y = constant obtained when $\eta = 0$,

(22)  $$Y_n(y) = \cos n\pi y \qquad n = 0, 1, 2, 3, \ldots$$

From (21) it is now clear that the factor T satisfies the equation

$$T' = -\frac{(m^2 + n^2)\pi^2}{a^2}\,T$$

and, hence, that

(23)  $$T = e^{-[(m^2 + n^2)\pi^2/a^2]t}$$

Therefore, combining (20) and (22) with (23), we can write the product solution (18) explicitly:

(24)  $$u_{mn}(x,y,t) = E_{mn}\sin m\pi x\,\cos n\pi y\;e^{-[(m^2 + n^2)\pi^2/a^2]t}$$

None of the product solutions (24), by itself, can reduce to the required initial temperature
distribution (17). Hence, we must form a series of these and attempt to make this series satisfy
the initial temperature condition. But now, since we have two independent parameters m and n
in the product solutions, the general solution for u will be a double series:

$$u(x,y,t) = \sum_{m,n} u_{mn}(x,y,t) = \sum_{n=0}^{\infty}\sum_{m=1}^{\infty} E_{mn}\sin m\pi x\,\cos n\pi y\;e^{-[(m^2 + n^2)\pi^2/a^2]t}$$

When t = 0, this must reduce to f(x,y); that is,

(25)  $$f(x,y) = \sum_{n=0}^{\infty}\left(\sum_{m=1}^{\infty} E_{mn}\sin m\pi x\right)\cos n\pi y$$



Now the inner summation in (25) is a function only of n and x, say $G_n(x)$, and, hence, (25) can
be written

$$f(x,y) = \sum_{n=0}^{\infty} G_n(x)\cos n\pi y$$

But, for any particular value of x, this is just the Fourier half-range cosine expansion of f(x,y),
thought of now as a function of y for $0 \le y \le 1$. Hence, by familiar theory, we can write

(26)  $$G_n(x) = 2\int_0^1 f(x,y)\cos n\pi y\,dy$$

(For n = 0 the factor 2 in (26) must be replaced by 1, since $\int_0^1\cos^2 n\pi y\,dy$ equals 1 when
n = 0 and 1/2 otherwise.) But, by definition,

$$G_n(x) = \sum_{m=1}^{\infty} E_{mn}\sin m\pi x$$

and this is just the half-range sine expansion of the now known function $G_n(x)$ for $0 \le x \le 1$.
Hence,

(27)  $$E_{mn} = 2\int_0^1 G_n(x)\sin m\pi x\,dx$$

If we wish, we can substitute for $G_n(x)$ from (26) into (27), getting

$$E_{mn} = 2\int_0^1\left[2\int_0^1 f(x,y)\cos n\pi y\,dy\right]\sin m\pi x\,dx$$

$$\qquad = 4\int_0^1\!\!\int_0^1 f(x,y)\cos n\pi y\,\sin m\pi x\,dy\,dx$$

With $E_{mn}$ determined for all values of m and n, the formal solution is now complete.
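
The double-series coefficients can be computed by nested quadrature. The following sketch assumes an illustrative initial distribution $f(x,y) = \sin\pi x\,(1 + \cos 2\pi y)$, for which the only nonzero coefficients should be $E_{10} = E_{12} = 1$; the function names and the truncation limits M and N are assumptions. Note that the code uses the weight 2 rather than 4 when n = 0, because the constant cosine term has norm 1 instead of 1/2 on (0,1).

```python
import math

def simpson(func, a, b, n=100):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = func(a) + func(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * func(a + k * h)
    return s * h / 3.0

def E(m, n, f):
    """Coefficient of sin(m pi x) cos(n pi y) in the expansion of f on the unit square."""
    weight = 2.0 if n == 0 else 4.0   # n = 0 term needs half the usual weight
    inner = lambda x: simpson(lambda y: f(x, y) * math.cos(n * math.pi * y), 0.0, 1.0)
    return weight * simpson(lambda x: inner(x) * math.sin(m * math.pi * x), 0.0, 1.0)

def temperature(x, y, t, f, a=1.0, M=3, N=3):
    """Truncated double series for u(x,y,t), with m = 1..M and n = 0..N."""
    total = 0.0
    for m in range(1, M + 1):
        for n in range(0, N + 1):
            decay = math.exp(-(m * m + n * n) * math.pi ** 2 * t / (a * a))
            total += (E(m, n, f) * math.sin(m * math.pi * x)
                      * math.cos(n * math.pi * y) * decay)
    return total

# Hypothetical initial temperature distribution.
f = lambda x, y: math.sin(math.pi * x) * (1.0 + math.cos(2.0 * math.pi * y))
E10, E12, E21 = E(1, 0, f), E(1, 2, f), E(2, 1, f)
```

In practice the coefficients would be computed once and cached rather than recomputed inside `temperature`; the direct form is kept here for brevity.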


EXERCISES 


1 Can the procedure illustrated in Example 1 be modified to obtain a description of the motion
of the string when the frequency of the impressed force is one of the natural frequencies of
the string? How?

2 Can the procedure illustrated in Example 1 be modified to obtain a description of the
motion of the string when the impressed force is of the form f(x,t) = where (a)
0(f) is periodic? How? (b) 0(f) is not periodic? How?

3 Work Example 1 with

4 Work Example 1 with $\phi(x) = x(l - x)$.

5 A uniform string of length l is acted upon by a distributed frictional force equal at each
point to $\dfrac{cw}{g}\dfrac{\partial y}{\partial t}$, where c is an arbitrary proportionality constant. Discuss the subsequent
motion of the string given that it begins with initial displacement y(x,0) = g(x) and initial
velocity $\dot y(x,0) = h(x)$. In particular, show that certain frequencies in the "spectrum" of
the string may be overdamped, and determine which ones. Do the concepts of magnification
ratio and phase shift apply to the vibration of a string with viscous damping? How?

6 A uniform string for which a = 1 and $l = \pi$ begins to move with initial displacement
y(x,0) = g(x) and initial velocity $\dot y(x,0) = h(x)$. Determine the subsequent motion of the




string if it is acted upon by a distributed frictional force equal at each point to $\dfrac{cw}{g}\dfrac{\partial y}{\partial t}$,
where  a  c = 1    b  c = 2    c  c = 4    d  c = 8


7 A uniform shaft of length l with both ends free, vibrating torsionally, is acted upon by a
periodic impressed torque equal at each point to $(g/\rho J)\phi(x)\sin\omega t$. If the initial
displacement of the shaft is $\theta(x,0) = g(x)$ and if its initial angular velocity is $\dot\theta(x,0) = h(x)$, discuss
the problem of determining its subsequent motion.

8 Work Exercise 7 for a uniform shaft of length l fixed at x = 0 and free at x = l.

9 Find the steady-state motion produced in a uniform beam of length l which is simply
supported at each end given that the beam is acted upon by a distributed load whose
magnitude per unit length is $x(l - x)\sin\omega t$.

10 A uniform cantilever beam is built in at x = 0 and free at x = l. Find the steady-state
motion produced in the beam by a distributed load whose magnitude per unit length is
$x\sin\omega t$.

11 Work Example 2, given that the left end of the rod is perfectly insulated.

12 A slender rod of infinite length has its curved surface perfectly insulated. Find the
steady-state temperature distribution in the rod if the temperature at the finite end of the rod
varies according to the law $u(0,t) = \sin\omega t$. Explain how this result can be used to determine
the steady-state temperature distribution produced by an arbitrary periodic temperature
condition at the finite end of the rod.

13 A slender rod of length l has its curved surface perfectly insulated. Its right end is
maintained at the constant temperature u(l,t) = 0. At the left end the temperature varies
according to the law $u(0,t) = \sin\omega t$. Determine the steady-state temperature distribution
in the rod. Explain how this result can be used to determine the steady-state temperature
distribution produced in the rod by an arbitrary periodic temperature condition at the
left end. [Hint: Verify that $\lambda$ can be chosen so that


ui(x,t) — sin cut cos cosh X ( 


X ^2 - — cos cut sin y sinh X ^2 — 

( x\ \x ( a;\ . Xa; 

ws (x,t) = sin cut cos X I 2 — - J cosh — cos cut sin X 1 2 — - J sinh — 


are solutions of the one-dimensional heat equation. Then determine A i and A 2 so that 
u(x,t) — AiUi(x,t) + AiUi{x,t) satisfies the boundary conditions of the problem.] 

14 A slender rod of length l has its curved surface and left end perfectly insulated. Heat is 
generated within the rod at a rate per unit volume equal to <t>(x). Find the temperature in 
the rod as a function of x and t, if the right end of the rod is maintained at the constant 
temperature u (l, t) = 0 and if the initial temperature distribution in the rod is u(x,0) — g(x). 

15 If the transmission line in Example 3 is initially “dead,” i.e., if at t = 0 the potential and
current along the line are identically zero, determine the complete response of the line,
transient as well as steady-state, to the signal voltage $E_0\cos\omega t$. [Hint: Show that, if
$-p \pm iq$ are the roots of the equation

$$LCm^2 + (RC + GL)m + (RG + \lambda^2) = 0$$

then

$$u_\lambda = e^{-pt}\,[A(\lambda)\cos qt + B(\lambda)\sin qt]\,\sin\lambda x$$

is a solution of the telephone equation which is bounded as $x \to \infty$ and is zero for all values
of t when x = 0. Then show that the steady-state solution found in Example 3 plus the
integral of $u_\lambda$ over all values of $\lambda$ is a solution which satisfies both boundary conditions (9).
Finally, determine $A(\lambda)$ and $B(\lambda)$ so that both e and $\partial e/\partial t$ are zero when t = 0.]


338 


PARTIAL DIFFERENTIAL EQUATIONS 


CHAP. 8 


16 Work Example 3, by replacing the signal voltage $E_0\cos\omega t$ by $E_0e^{i\omega t}$ and assuming a
solution of the form $u(x,t) = E_0$

17 Work Example 4, given that the edges from (0,0) to (0,1) and (1,0) are maintained at the 
constant temperature 0° and the other two edges are insulated. 

18 Determine E mn in Exercise 4 for f(x,y) — x + y. 

19 A thin sheet of metal coincides with the square in the xy-plane whose vertices are the points
(0,0), (1,0), (1,1), and (0,1). Along the edge from (0,0) to (1,0) the temperature distribution
u(x,0) = f(x) is maintained. The other three edges are maintained at the temperature 0°.
Find the steady-state temperature as a function of x and y. 

20 Work Exercise 19 if the boundary conditions are 


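a  $u(x,0) = f(x)$,  $\dfrac{\partial u}{\partial y}\Big|_{x,1} = \dfrac{\partial u}{\partial x}\Big|_{0,y} = \dfrac{\partial u}{\partial x}\Big|_{1,y} = 0$

b  $u(x,0) = u(x,1) = 0°$,  $\dfrac{\partial u}{\partial x}\Big|_{0,y} = 0$,  $u(1,y) = f(y)$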


21 If an arbitrary temperature distribution exists along each of the edges of a square sheet of 
metal, how can the steady-state temperature distribution in the sheet be found? 

22 A thin sheet of metal bounded by the x-axis, the lines x = 0 and x = 1, and stretching to
infinity in the y-direction has its upper and lower faces insulated and its vertical edges
maintained at the constant temperature 0°. Over its base the temperature distribution
u(x,0) = 100° is maintained. Find the steady-state temperature at any point in the sheet.

23 Work Exercise 22 if the boundary conditions are 


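a  $\dfrac{\partial u}{\partial x}\Big|_{0,y} = \dfrac{\partial u}{\partial x}\Big|_{1,y} = 0$,  $u(x,0) = 100°$

b  $u(0,y) = 0°$,  $u(1,y) = 100°$,  $u(x,0) = 100x$

c  $u(0,y) = 0°$,  $\dfrac{\partial u}{\partial x}\Big|_{1,y} = 0$,  $u(x,0) = 100°$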


24 Work Exercise 22, given that the left edge and the lower edge of the sheet are maintained 
at the temperature 0° and the known distribution u(l,y) = /(y) is maintained along the 
right edge. 

25 Determine the natural frequencies and nodal lines of a uniform square drumhead.


8.7 

Laplace transform methods 

In Chap. 7 we observed how the Laplace transformation con- 
verted an ordinary, linear, constant-coefficient differential equa- 
tion into a linear algebraic equation from which the transform of 
the dependent variable could readily be found. In much the same 
way, the Laplace transformation can often be used to advantage 
in solving linear, constant-coefficient partial differential equations 
in two independent variables. In such cases it leads not to an 
algebraic equation but to an ordinary differential equation in the 
transform of the dependent variable. The general procedure is as 
follows: 

The given partial differential equation, with its accompanying 
boundary conditions and initial conditions, is transformed with 
respect to one of its independent variables, usually t. Partial 




derivatives with respect to this variable are, of course, transformed 
by the familiar formulas of Theorem 2 of Sec. 7.2 and its corollary.
For partial derivatives with respect to the other independent 
variable we assume* that the operations of differentiating and 
taking the Laplace transform can be interchanged. Then, if the 
independent variables are x and t, say, we have 


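$$\mathcal{L}\left\{\frac{\partial f}{\partial x}\right\} = \int_0^\infty \frac{\partial f(x,t)}{\partial x}\,e^{-st}\,dt = \frac{\partial}{\partial x}\int_0^\infty f(x,t)\,e^{-st}\,dt = \frac{d}{dx}\,\mathcal{L}\{f(x,t)\}$$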


the derivative in the last term being a total derivative because 
£ { f(x,t) } is not a function of t. Similar formulas, of course, hold 
for x-derivatives of higher orders. Thus, the result of the trans- 
formation is an ordinary differential equation in £{f(x,t)} in 
which x is the independent variable and s enters as a parameter. 
Because s occurs in the coefficients of the differential equation, the 
arbitrary constants appearing in its complete solution will in general
be functions of s which must be determined by imposing the
transformed boundary conditions on the complete solution of 
the transformed differential equation. After this has been done, 
the inverse transformation is carried out and the solution to the 
original problem is obtained. The details of this process can best 
be made clear through examples. 


EXAMPLE 1 


A semi-infinite string is initially at rest in a position coinciding with the positive half of the
x-axis. At t = 0, the left end of the string begins to move along the y-axis in a manner described
by y(0,t) = f(t), where f(t) is a known function. Find the displacement y(x,t) of the string at any
point at any subsequent time.

The partial differential equation to be solved is, of course, the one-dimensional wave 
equation 


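(1) $$\frac{\partial^2 y}{\partial t^2} = a^2\,\frac{\partial^2 y}{\partial x^2}$$

subject to the boundary conditions

(2) $y(0,t) = f(t)$

(3) $y(x,t)$ bounded as $x \to \infty$

and the initial conditions

(4) $y(x,0) = 0$

(5) $\dfrac{\partial y}{\partial t}\Big|_{x,0} = 0$

If we take the Laplace transform of Eq. (1) with respect to t, we obtain

$$s^2\,\mathcal{L}\{y(x,t)\} - s\,y(x,0) - \frac{\partial y}{\partial t}\Big|_{x,0} = a^2\,\mathcal{L}\left\{\frac{\partial^2 y(x,t)}{\partial x^2}\right\} = a^2\,\frac{d^2}{dx^2}\,\mathcal{L}\{y(x,t)\}$$

or, using the initial conditions (4) and (5),

(6) $$\frac{d^2\,\mathcal{L}\{y(x,t)\}}{dx^2} - \frac{s^2}{a^2}\,\mathcal{L}\{y(x,t)\} = 0$$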


* This is justified by Theorems 3 and 6, Sec. 7.1.




Solving this ordinary differential equation for $\mathcal{L}\{y(x,t)\}$, we find without difficulty that

(7) $$\mathcal{L}\{y(x,t)\} = A(s)\,e^{-(s/a)x} + B(s)\,e^{(s/a)x}$$

To determine the coefficient functions A(s) and B(s), we observe first that, if y(x,t) remains
finite as $x \to \infty$ [condition (3)], so must $\mathcal{L}\{y(x,t)\}$. Hence, B(s) must be zero. Furthermore,
putting x = 0 in (7) after B(s) is set equal to zero, we have $\mathcal{L}\{y(0,t)\} = A(s)$, and, from the
boundary condition (2), we have $\mathcal{L}\{y(0,t)\} = \mathcal{L}\{f(t)\}$. Therefore, (7) becomes

$$\mathcal{L}\{y(x,t)\} = \mathcal{L}\{f(t)\}\,e^{-(s/a)x}$$

The inverse of this can be found at once by suppressing the exponential factor and using
Corollary 2 of Theorem 6, Sec. 7.4. The solution to our problem is, therefore,

$$y(x,t) = f\!\left(t - \frac{x}{a}\right)u\!\left(t - \frac{x}{a}\right)$$
which represents a wave traveling to the right along the string with velocity a. Evidently, the 
effect of this wave is to give the string at a general point the same displacement that the left 
end of the string had x/a units of time earlier. 
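As a rough numerical check on this traveling-wave result, the following sketch builds $y(x,t) = f(t - x/a)\,u(t - x/a)$ for one assumed end motion and estimates the residual of the wave equation (1) by central differences. The function names, the choice $f(t) = 1 - \cos t$, the value a = 2, and the step size h are illustrative assumptions, not part of the text.

```python
import math

def traveling_wave(f, a):
    """Return y(x, t) = f(t - x/a) * u(t - x/a) as a Python function."""
    def y(x, t):
        tau = t - x / a  # time at which the left end had this displacement
        return f(tau) if tau > 0.0 else 0.0
    return y

def wave_residual(y, a, x, t, h=1e-3):
    """Central-difference estimate of y_tt - a^2 * y_xx at the point (x, t)."""
    y_tt = (y(x, t + h) - 2.0 * y(x, t) + y(x, t - h)) / h**2
    y_xx = (y(x + h, t) - 2.0 * y(x, t) + y(x - h, t)) / h**2
    return y_tt - a**2 * y_xx

def end_motion(t):
    """Assumed motion of the left end (hypothetical choice with f(0) = 0)."""
    return 1.0 - math.cos(t)

a = 2.0                                  # assumed propagation velocity
y = traveling_wave(end_motion, a)
residual = wave_residual(y, a, x=1.5, t=2.0)
print("displacement at x = 1.5, t = 2.0:", y(1.5, 2.0))
print("finite-difference residual of Eq. (1):", residual)
```

The residual should be small compared with the individual second derivatives, and points with x > at should return zero because the wave has not yet reached them.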


EXAMPLE 2 

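A semi-infinite string is initially at rest in a position coinciding with the positive half of the x-axis.
A concentrated transverse force of magnitude $F_0$ moves along the string with constant velocity v,
beginning at t = 0 at the point x = 0. Find the displacement y(x,t) of the string at any point
at any subsequent time.

In this problem, since there is an external force applied to the string, we must use the
nonhomogeneous wave equation [Eq. (2), Sec. 8.2]

$$\frac{\partial^2 y}{\partial t^2} = a^2\,\frac{\partial^2 y}{\partial x^2} + \frac{g}{w}\,F(x,t)$$

To obtain F(x,t) we observe that a single concentrated load $F_0$ acting at the point x = vt
corresponds to a load per unit length which is infinite at x = vt and zero everywhere else. Hence,
since $F_0$ is assumed to act on the string in the negative y-direction,

$$F(x,t) = -F_0\,\delta\!\left(t - \frac{x}{v}\right)$$

where $\delta(t - x/v)$ is the unit impulse, or $\delta$ function, which we discussed in Sec. 7.7. Our problem,
therefore, is to solve the equation

(8) $$\frac{\partial^2 y}{\partial t^2} = a^2\,\frac{\partial^2 y}{\partial x^2} - \frac{gF_0}{w}\,\delta\!\left(t - \frac{x}{v}\right)$$

subject to the boundary conditions

(9) $y(0,t) = 0$

(10) $y(x,t)$ bounded as $x \to \infty$

and the initial conditions

(11) $y(x,0) = 0$

(12) $\dfrac{\partial y}{\partial t}\Big|_{x,0} = 0$

If we take the Laplace transform of Eq. (8) with respect to t and use the initial conditions
(11) and (12), we obtain, just as in Example 1,

$$s^2\,\mathcal{L}\{y(x,t)\} = a^2\,\frac{d^2}{dx^2}\,\mathcal{L}\{y(x,t)\} - \frac{gF_0}{w}\,e^{-(s/v)x}$$

or

(13) $$\frac{d^2}{dx^2}\,\mathcal{L}\{y(x,t)\} - \frac{s^2}{a^2}\,\mathcal{L}\{y(x,t)\} = \frac{gF_0}{wa^2}\,e^{-(s/v)x}$$

The solution of this equation by the methods of Chap. 2 presents no difficulty, and we find, for
the complete solution,

(14) $$\mathcal{L}\{y(x,t)\} = \begin{cases} A(s)\,e^{-(s/a)x} + B(s)\,e^{(s/a)x} + \dfrac{gv^2F_0}{w(a^2 - v^2)s^2}\,e^{-(s/v)x} & v \ne a \\[2ex] A(s)\,e^{-(s/a)x} + B(s)\,e^{(s/a)x} - \dfrac{gF_0}{2was}\,x\,e^{-(s/a)x} & v = a \end{cases}$$

In each case we must have B(s) = 0 in order that $\mathcal{L}\{y(x,t)\}$ should remain finite as $x \to \infty$.
To determine A(s) we have, from the boundary condition (9), the information that, when x = 0,

$$\mathcal{L}\{y(x,t)\} = \mathcal{L}\{y(0,t)\} = 0$$

Hence, substituting into Eq. (14), we obtain

$$A(s) = \begin{cases} -\dfrac{gv^2F_0}{w(a^2 - v^2)s^2} & v \ne a \\[2ex] 0 & v = a \end{cases}$$

and, therefore,

$$\mathcal{L}\{y(x,t)\} = \begin{cases} -\dfrac{gv^2F_0}{w(a^2 - v^2)s^2}\left[e^{-(s/a)x} - e^{-(s/v)x}\right] & v \ne a \\[2ex] -\dfrac{gF_0}{2was}\,x\,e^{-(s/a)x} & v = a \end{cases}$$

Taking inverses, we have finally

(15) $$y(x,t) = -\frac{gv^2F_0}{w(a^2 - v^2)}\left[\left(t - \frac{x}{a}\right)u\!\left(t - \frac{x}{a}\right) - \left(t - \frac{x}{v}\right)u\!\left(t - \frac{x}{v}\right)\right] \qquad v \ne a$$

and

(16) $$y(x,t) = -\frac{gF_0}{2wa}\,x\,u\!\left(t - \frac{x}{a}\right) \qquad v = a$$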


Plots of (15) in the “subsonic” case v = %a and the “supersonic” case v = ^a, and of the
“transonic” case v = a described by (16), are shown in Fig. 8.14 for a typical time t. The
discontinuity in y(x,t) in the “transonic” case, when the disturbance travels with exactly the
propagation velocity a, is interesting.

FIGURE 8.14 Plot showing the displacement of a semi-infinite string produced by a concentrated
force moving along the string with a velocity of (a) %, (b) 1, and (c) % times the propagation
velocity for the string.
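The three regimes can be explored numerically. The sketch below is a minimal illustration that takes the solution of Example 2 to be $y = -\frac{gv^2F_0}{w(a^2 - v^2)}\left[(t - x/a)\,u(t - x/a) - (t - x/v)\,u(t - x/v)\right]$ for $v \ne a$ and $y = -\frac{gF_0}{2wa}\,x\,u(t - x/a)$ for v = a; the function names and the default values $F_0 = g = w = 1$ are assumptions made only for illustration.

```python
def ramp(tau):
    """tau * u(tau): zero until the argument becomes positive."""
    return tau if tau > 0.0 else 0.0

def moving_load_displacement(x, t, v, a, F0=1.0, g=1.0, w=1.0):
    """Displacement of the string for a load moving with velocity v.

    The branch v == a uses the 'transonic' formula; every other v uses
    the formula valid for v != a (assumed forms, see the lead-in).
    """
    if v == a:
        return -(g * F0 / (2.0 * w * a)) * x if t > x / a else 0.0
    k = g * v**2 * F0 / (w * (a**2 - v**2))
    return -k * (ramp(t - x / a) - ramp(t - x / v))

a = 2.0  # assumed propagation velocity
subsonic = moving_load_displacement(1.0, 3.0, v=1.0, a=a)
transonic = moving_load_displacement(1.0, 3.0, v=a, a=a)
nearly_transonic = moving_load_displacement(1.0, 3.0, v=a * (1.0 - 1e-6), a=a)
print(subsonic, transonic, nearly_transonic)
```

With these assumed forms, letting v approach a in the first branch reproduces the second, which is a useful consistency check between the two formulas; the left end stays fixed in every case.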

EXAMPLE 3 

A semi-infinite cable of negligible leakage and inductance is initially “dead.” At t = 0 an
arbitrary signal voltage E(t) is suddenly applied at the sending end. Find the potential e(x,t) at any
point on the cable at any subsequent time.

In this problem we have to solve the telegraph equation (21a), Sec. 8.2,

(17) $$\frac{\partial^2 e}{\partial x^2} = a^2\,\frac{\partial e}{\partial t}\,\dagger \qquad a^2 = RC$$

subject to the boundary conditions

(18) $e(0,t) = E(t)$

(19) $e(x,t)$ bounded as $x \to \infty$

and the initial condition

(20) $e(x,0) = 0$

Taking the Laplace transform of (17) with respect to t and using the initial condition (20),
we obtain

$$\frac{d^2}{dx^2}\,\mathcal{L}\{e(x,t)\} = a^2s\,\mathcal{L}\{e(x,t)\}$$

as the ordinary differential equation satisfied by the transform of the potential. Solving this for
$\mathcal{L}\{e(x,t)\}$ we find without difficulty that

(21) $$\mathcal{L}\{e(x,t)\} = A(s)\,e^{-a\sqrt{s}\,x} + B(s)\,e^{a\sqrt{s}\,x}$$

Since e(x,t) and, hence, $\mathcal{L}\{e(x,t)\}$ are to remain finite as $x \to \infty$, it is necessary that B(s) = 0.
To determine A(s) we observe that, when x = 0,

$$\mathcal{L}\{e(x,t)\} = \mathcal{L}\{E(t)\}$$

Hence, substituting into Eq. (21), we find

$$A(s) = \mathcal{L}\{E(t)\}$$

and

(22) $$\mathcal{L}\{e(x,t)\} = \mathcal{L}\{E(t)\}\,e^{-ax\sqrt{s}}$$

To determine e(x,t) it will be necessary to use the convolution theorem, but, before this
can be done, we must know the inverse of $e^{-ax\sqrt{s}}$. Up to this point in our work we have not
encountered any function of t having this function of s for its transform. However, it can be
shown (see Exercises 1 and 2) that

$$\mathcal{L}^{-1}\{e^{-b\sqrt{s}}\} = \frac{b}{2\sqrt{\pi}}\,\frac{e^{-b^2/4t}}{t\sqrt{t}}$$

Hence, taking b = ax and setting up the convolution integral, we obtain from (22)

(23) $$e(x,t) = \frac{ax}{2\sqrt{\pi}}\int_0^t E(t - \lambda)\,\frac{e^{-a^2x^2/4\lambda}}{\lambda\sqrt{\lambda}}\,d\lambda$$

In particular, if E(t) is a unit step voltage, we have, since $u(t - \lambda) = 1$ for $\lambda < t$, and
$u(t - \lambda) = 0$ for $\lambda > t$,

$$e(x,t) = \frac{ax}{2\sqrt{\pi}}\int_0^t \frac{e^{-a^2x^2/4\lambda}}{\lambda\sqrt{\lambda}}\,d\lambda$$

† This is identical with the one-dimensional heat equation, and so all our
conclusions apply equally well to the problem of the flow of heat in a slender,
insulated, semi-infinite rod whose left end is maintained at the time-dependent
temperature E(t).

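If we let $a^2x^2/4\lambda = z^2$, then $\lambda = a^2x^2/4z^2$, $d\lambda = -(a^2x^2/2z^3)\,dz$, and the last integral becomes

$$e(x,t) = \frac{ax}{2\sqrt{\pi}}\int_\infty^{ax/2\sqrt{t}} e^{-z^2}\,\frac{8z^3}{a^3x^3}\left(-\frac{a^2x^2}{2z^3}\right)dz = \frac{2}{\sqrt{\pi}}\int_{ax/2\sqrt{t}}^\infty e^{-z^2}\,dz$$

(24) $$\qquad = \frac{2}{\sqrt{\pi}}\int_0^\infty e^{-z^2}\,dz - \frac{2}{\sqrt{\pi}}\int_0^{ax/2\sqrt{t}} e^{-z^2}\,dz$$

Under the substitution $z^2 = v$, the first integral becomes

$$\frac{2}{\sqrt{\pi}}\int_0^\infty e^{-v}\,\frac{dv}{2\sqrt{v}} = \frac{1}{\sqrt{\pi}}\,\Gamma(\tfrac12) = 1$$

since $\Gamma(\tfrac12) = \sqrt{\pi}$. Hence, Eq. (24) can be written

(25) $$e(x,t) = 1 - \frac{2}{\sqrt{\pi}}\int_0^{ax/2\sqrt{t}} e^{-z^2}\,dz = 1 - \operatorname{erf}\frac{ax}{2\sqrt{t}}$$

where

(26) $$\operatorname{erf}\theta = \frac{2}{\sqrt{\pi}}\int_0^\theta e^{-z^2}\,dz$$

This is the so-called error function, a tabulated function which can be found in most handbooks
of mathematical tables.*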
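The unit-step response $e(x,t) = 1 - \operatorname{erf}(ax/2\sqrt{t})$ is easy to evaluate with a library error function. The sketch below uses `math.erf` from the Python standard library, compares it with a midpoint-rule approximation of the defining integral (26), and estimates the residual of Eq. (17) by finite differences. The function names, the sample values a = 1.2, x = 1, t = 0.5, and the quadrature resolution are illustrative assumptions.

```python
import math

def erf_by_quadrature(theta, n=2000):
    """Midpoint-rule approximation of (2/sqrt(pi)) * integral_0^theta exp(-z^2) dz."""
    h = theta / n
    total = sum(math.exp(-((i + 0.5) * h) ** 2) for i in range(n))
    return 2.0 / math.sqrt(math.pi) * total * h

def step_response(x, t, a):
    """Potential e(x, t) for a unit step voltage, valid for t > 0."""
    return 1.0 - math.erf(a * x / (2.0 * math.sqrt(t)))

def telegraph_residual(x, t, a, h=1e-3):
    """Finite-difference estimate of e_xx - a^2 * e_t at (x, t)."""
    e_xx = (step_response(x + h, t, a) - 2.0 * step_response(x, t, a)
            + step_response(x - h, t, a)) / h**2
    e_t = (step_response(x, t + h, a) - step_response(x, t - h, a)) / (2.0 * h)
    return e_xx - a**2 * e_t

print("erf(0.8) from math.erf:  ", math.erf(0.8))
print("erf(0.8) from quadrature:", erf_by_quadrature(0.8))
print("e(1, 0.5) with a = 1.2:  ", step_response(1.0, 0.5, 1.2))
print("residual of Eq. (17):    ", telegraph_residual(1.0, 0.5, 1.2))
```

At x = 0 the response equals the applied unit voltage for every t > 0, and far down the cable it is still essentially zero, as the boundary conditions (18) and (19) require.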


EXERCISES 



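1 If

$$f(\lambda) = \int_0^\infty e^{-z^2 - \lambda/z^2}\,dz$$

show by means of the substitution $u = \sqrt{\lambda}/z$ that

$$f(\lambda) = \sqrt{\lambda}\int_0^\infty \frac{e^{-u^2 - \lambda/u^2}}{u^2}\,du$$

Hence, by differentiating the first expression for $f(\lambda)$, show that

$$f'(\lambda) = -\frac{f(\lambda)}{\sqrt{\lambda}}$$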


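* Actually, most handbooks list not the error function as here defined and
used in physics and engineering, but rather the so-called probability integral
of mathematical statistics:

$$\Phi(\theta) = \frac{1}{\sqrt{2\pi}}\int_0^\theta e^{-w^2/2}\,dw$$

If the substitution $z = w/\sqrt{2}$ is made in the error function (26), it becomes

$$\operatorname{erf}\theta = \frac{2}{\sqrt{2\pi}}\int_0^{\sqrt{2}\,\theta} e^{-w^2/2}\,dw$$

and we obtain the relation

$$\operatorname{erf}\theta = 2\,\Phi(\sqrt{2}\,\theta)$$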




Solve this differential equation, using the fact that

$$f(0) = \tfrac12\,\Gamma(\tfrac12) = \frac{\sqrt{\pi}}{2}$$

and show that

$$f(\lambda) = \frac{\sqrt{\pi}}{2}\,e^{-2\sqrt{\lambda}}$$


2 


Finally, use this result to show that 



2 Use the results of the last exercise, together with Theorem 8, Sec. 7.4, to show that

$$\mathcal{L}\left\{\frac{b\,e^{-b^2/4t}}{2\sqrt{\pi}\,t\sqrt{t}}\right\} = e^{-b\sqrt{s}}$$

3 a In Example 3, what is the response of the line if E(t) is a unit impulse voltage? (Hint:
Recall from Sec. 7.7 the relation between the response of a system to a unit step function
and to a unit impulse.)

b Using Eq. (25) and the appropriate Duhamel formula, obtain a formula different from
Eq. (23) for the response of the line in Example 3 to a general voltage.

4 Using Laplace transform methods, determine the motion of a string of length l whose
initial displacement and initial velocity are, respectively, $y(x,0) = \sin(m\pi x/l)$ and
$\partial y/\partial t\,|_{x,0} = \sin(n\pi x/l)$. Can these results be used to obtain the motion of the string
produced by arbitrary initial conditions? How?

5 Using Laplace transform methods, determine the response of a string of length l to a
distributed force $f(x,t) = \sin(m\pi x/l)\,\sin\omega t$ if the string is initially at rest in its equilibrium
position. Explain how these results can be used to determine the response of the string
to a distributed force $f(x,t) = g(x)\sin\omega t$, where g(x) is defined arbitrarily on the interval
0 < x < l, and to a distributed force $\sin(n\pi x/l)\,h(t)$, where h(t) is an arbitrary periodic
function whose frequency is distinct from each natural frequency of the string.

6 Work Example 2, given that the transverse force which moves along the string is a
rectangular pulse of height $F_0$ initially acting on the portion of the string between x = 0 and
x = 1.

7 A semi-infinite string whose weight per unit length is w has its left end fixed at the origin.
The infinite end is fastened to a ring which slides without friction along a vertical rod.
Initially, the string is at rest in a position coinciding with the positive x-axis. At t = 0 the
support which maintained the string in its horizontal position is removed and the string
begins to fall freely under the influence of gravity. Determine its subsequent position as a
function of x and t.

8 A shaft of uniform cross section is built in at x = 0 and free at x = l. At t = 0, while the
shaft is at rest in its equilibrium position, a constant torque $T_0$ is suddenly applied to the
free end. Find the Laplace transform of the resultant angular displacement. What is the
angular displacement of the free end as a function of time? (Hint: The boundary condition
at x = l is $EJ\,\dfrac{\partial\theta}{\partial x} = T_0$.)

9 Work Exercise 8, given that the torque applied at the free end is a unit impulse instead of a
step function.

10 A semi-infinite string initially at rest in a position coinciding with the positive x-axis is
acted upon by a concentrated force $F_0\sin\omega t$ applied at the point x = b. Find the Laplace
transform of the resultant displacement of the string. What is the displacement of the
string at the point x = b as a function of time?


CHAPTER NINE 


Bessel Functions 
and Legendre 
Polynomials 


9.1

Theoretical preliminaries

In solving partial differential equations by the method of separa- 
tion of variables, we are often led to ordinary differential equa- 
tions with variable coefficients which cannot be solved in terms of 
familiar functions. The usual procedure in such cases is to obtain 
solutions in the form of infinite series, which can be taken as the 
definitions of new functions to be studied in detail and eventually 
tabulated if they prove of sufficient importance. In this section we 
shall discuss the general problem of obtaining series solutions of 
the form 

(1) $$y = (x - a)^r\,[a_0 + a_1(x - a) + a_2(x - a)^2 + \cdots]$$
for the general linear second-order differential equation 

(2) y" + P(x)y' + Q(x)y = 0 

We shall not require the exponent r to be a positive integer, and in 
general it will not be. Hence the solutions we obtain will usually 
not be Taylor expansions. 

The analysis involves a consideration of several cases,
depending upon the behavior of the coefficient functions P(x)
and Q(x) at the point x = a around which we propose to expand
the solution y. In most of our work, the variables x and y and
the coefficient functions P(x) and Q(x) will all be real. However,
this is not a necessary restriction, and, in the basic definitions
and theorems we shall introduce in this section, x, y, P(x), and
Q(x) may be either real or complex. In the first place, both P(x)
and Q(x) may be analytic at x = a; that is, they may possess
Taylor expansions around the point x = a. When this happens,
x = a is said to be an ordinary point of the differential equation.
A point which is not an ordinary point is called a singular point.
At a singular point, although P(x) and Q(x) do not both possess
Taylor expansions, it may be that the products

$$(x - a)P(x) \qquad \text{and} \qquad (x - a)^2Q(x)$$

do have Taylor expansions. A singular point at which this is the
case is said to be regular; otherwise it is called irregular. In our
work we shall be concerned exclusively with the expansion of
solutions of Eq. (2) around ordinary points and regular singular points.


EXAMPLE 1

For the differential equation

$$y'' + \frac{1}{x}\,y' + \frac{3}{x(x - 1)^3}\,y = 0$$

x = 0 and x = 1 are singular points, since at x = 0 both P(x) and Q(x) become infinite, while
at x = 1, although P(x) is analytic, Q(x) becomes infinite. All other points are ordinary points.
The point x = 0 is a regular singular point, since each of the products

$$xP(x) = 1 \qquad \text{and} \qquad x^2Q(x) = x^2\,\frac{3}{x(x - 1)^3} = \frac{3x}{(x - 1)^3}$$

is analytic at x = 0, i.e., can be expanded in a series of positive integral powers of x. The point
x = 1 is an irregular singular point, however, because, although the product

$$(x - 1)P(x) = \frac{x - 1}{x}$$

is analytic at x = 1, the product

$$(x - 1)^2Q(x) = (x - 1)^2\,\frac{3}{x(x - 1)^3} = \frac{3}{x(x - 1)}$$

becomes infinite there and hence is not analytic.


The importance of the classification of values of x into ordinary
and singular points is apparent from the following theorems, which
are proved in more advanced treatments of the theory of differential
equations.*


THEOREM 1

At an ordinary point x = a of the differential equation $y'' + P(x)y' + Q(x)y = 0$,
every solution is analytic; i.e., can be represented by a series of the form

$$y = a_0 + a_1(x - a) + a_2(x - a)^2 + \cdots$$

Moreover, the radius of convergence of each series solution is equal to the distance
from a to the nearest singular point of the equation.

THEOREM 2

At a regular singular point x = a of the differential equation

$$y'' + P(x)y' + Q(x)y = 0$$

there is at least one solution which possesses an expansion of the form

$$y = (x - a)^r\,[a_0 + a_1(x - a) + a_2(x - a)^2 + \cdots]$$

and this series will converge for $0 < |x - a| < R$, where R is the distance from a
to the nearest of the other singular points of the equation.

THEOREM 3

At an irregular singular point x = a of the differential equation

$$y'' + P(x)y' + Q(x)y = 0$$

there are in general no solutions with expansions consisting solely of powers of
x - a.

* See, for instance, E. T. Whittaker and G. N. Watson, “Modern Analysis,”
pp. 194-203, The Macmillan Company, New York, 1943.

In using Theorems 1 and 2 to infer the radius of convergence
of power series solutions of Eq. (2), it must be borne in mind that
the singular point nearest to, but distinct from, the point of
expansion may be complex, even though the point around which we are
expanding is real. For instance, for the differential equation

$$y'' + \frac{1}{1 + x^2}\,y' + y = 0$$

the coefficient functions

$$P(x) = \frac{1}{1 + x^2} \qquad \text{and} \qquad Q(x) = 1$$

are analytic for all real values of x. However, P(x) fails to be
analytic at $x = \pm i$; hence, these two points are singular points of
the differential equation. Therefore, a series solution around the
ordinary point x = 2, say, would have radius of convergence
$R = \sqrt{5}$, since in the complex plane the distance from the point
of expansion x = 2 to the nearest singular point x = i (or x = -i)
is $\sqrt{5}$ (Fig. 9.1).
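The distance computation in this illustration can be sketched in a few lines. The helper below simply measures, in the complex plane, the distance from the point of expansion to the nearest listed singular point; the function name is hypothetical, and the singular points must be supplied by the reader, since nothing here locates them automatically.

```python
import math

def radius_of_convergence(center, singular_points):
    """Distance from the expansion point to the nearest singular point.

    Both the center and the singular points may be real or complex.
    """
    return min(abs(complex(center) - complex(s)) for s in singular_points)

# Expansion about x = 2 for an equation whose only singular points are x = +i and x = -i.
R = radius_of_convergence(2, [1j, -1j])
print(R, math.sqrt(5))
```

Expanding instead about x = 0 with the same singular points gives R = 1, even though the coefficient functions are analytic at every real x.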



To obtain series solutions around an ordinary
point or a regular singular point we use the so-called method of
Frobenius.* First of all, for convenience, we translate axes, if
necessary, so that the point of expansion becomes the point x = 0.
Now, if x = 0 is either an ordinary point or a regular singular
point, both xP(x) and $x^2Q(x)$ are analytic, and, hence, we can
write

$$xP(x) = b_0 + b_1x + b_2x^2 + \cdots$$

$$x^2Q(x) = c_0 + c_1x + c_2x^2 + \cdots$$

Therefore, multiplying Eq. (2) by $x^2$ and then substituting for
xP(x) and $x^2Q(x)$, we have

(3) $$x^2y'' + x(b_0 + b_1x + b_2x^2 + \cdots)y' + (c_0 + c_1x + c_2x^2 + \cdots)y = 0$$

Next, we assume a series of the desired form

(4) $$y = x^r(a_0 + a_1x + a_2x^2 + \cdots)$$

where, without loss of generality, we can suppose that $a_0 \ne 0$. If
we substitute this series into Eq. (3), we have

$$x^2[a_0r(r - 1)x^{r-2} + a_1(r + 1)r\,x^{r-1} + a_2(r + 2)(r + 1)x^r + \cdots]$$
$$+\; x(b_0 + b_1x + b_2x^2 + \cdots)[a_0r\,x^{r-1} + a_1(r + 1)x^r + a_2(r + 2)x^{r+1} + \cdots]$$
$$+\; (c_0 + c_1x + c_2x^2 + \cdots)(a_0x^r + a_1x^{r+1} + a_2x^{r+2} + \cdots) = 0$$

or, collecting terms on the various powers of x,

(5) $$a_0[r(r - 1) + b_0r + c_0]x^r + \{a_1[(r + 1)r + b_0(r + 1) + c_0] + a_0(b_1r + c_1)\}x^{r+1}$$
$$+\; \{a_2[(r + 2)(r + 1) + b_0(r + 2) + c_0] + a_1[b_1(r + 1) + c_1] + a_0(b_2r + c_2)\}x^{r+2} + \cdots = 0$$

Equation (5) will be an identity if and only if the coefficient of
each power of x is zero, and thus we obtain the set of equations:

(6) $$a_0[r(r - 1) + b_0r + c_0] = 0$$
$$a_1[(r + 1)r + b_0(r + 1) + c_0] + a_0(b_1r + c_1) = 0$$
$$a_2[(r + 2)(r + 1) + b_0(r + 2) + c_0] + a_1[b_1(r + 1) + c_1] + a_0(b_2r + c_2) = 0$$
$$\cdots$$

Since $a_0 \ne 0$, it follows from the first of these equations that

(7) $$r^2 + (b_0 - 1)r + c_0 = 0$$

This quadratic equation in r is known as the indicial equation
of the differential equation relative to the point of expansion,
and its roots $r_1$ and $r_2$ are known as the exponents of the
differential equation at that point. For each of these values there is, in
general, a series solution of the form (4). And the coefficients in
these expansions can be determined, one by one, from the
successive equations in the set (6), which express each of the a's,
in turn, in terms of the a's preceding it in the series (4).

* Named for the German mathematician F. G. Frobenius (1849-1917).
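The recursive use of the set (6) lends itself to a short program. The sketch below is one possible arrangement, not a routine taken from the text: it assumes the Taylor coefficients $b_0, b_1, \ldots$ of xP(x) and $c_0, c_1, \ldots$ of $x^2Q(x)$ are supplied as lists (missing entries are treated as zero), solves the indicial equation (7), and then applies the general pattern visible in (6), namely $a_k[(r + k)(r + k - 1) + b_0(r + k) + c_0] + \sum_{j<k} a_j[b_{k-j}(r + j) + c_{k-j}] = 0$. The function names are hypothetical.

```python
import cmath

def indicial_roots(b0, c0):
    """Roots of r^2 + (b0 - 1) r + c0 = 0, returned as complex numbers."""
    disc = cmath.sqrt((b0 - 1) ** 2 - 4 * c0)
    return ((1 - b0) + disc) / 2, ((1 - b0) - disc) / 2

def frobenius_coefficients(b, c, r, n_terms, a0=1.0):
    """Coefficients a_0 .. a_{n_terms-1} of y = x^r (a_0 + a_1 x + ...).

    b and c hold the Taylor coefficients of x P(x) and x^2 Q(x).
    """
    def coef(seq, i):
        return seq[i] if i < len(seq) else 0.0

    a = [a0]
    for k in range(1, n_terms):
        s = r + k
        lead = s * (s - 1) + coef(b, 0) * s + coef(c, 0)
        if lead == 0:
            # Happens when r + k is also a root of the indicial equation.
            raise ZeroDivisionError("recurrence breaks down at k = %d" % k)
        tail = sum(a[j] * (coef(b, k - j) * (r + j) + coef(c, k - j))
                   for j in range(k))
        a.append(-tail / lead)
    return a

# Illustration: x^2 y'' + x y' + x^2 y = 0, i.e. x P = 1 and x^2 Q = x^2, with r = 0.
coeffs = frobenius_coefficients([1.0], [0.0, 0.0, 1.0], 0.0, 5)
print(indicial_roots(1.0, 0.0), coeffs)
```

The explicit check on the leading bracket flags the case in which the recurrence cannot be continued, which is exactly what can occur when the exponents differ by an integer.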




EXAMPLE 2

Find series solutions for the equation $9x^2y'' + (x + 2)y = 0$ around the origin.

Since P(x) = 0 and $Q(x) = (x + 2)/9x^2$, it follows that the origin is a regular singular
point of the given equation. Hence, by Theorem 2, there exists at least one solution with an
expansion of the form

$$y = x^r(a_0 + a_1x + a_2x^2 + \cdots)$$

Substituting this into the differential equation, we have

$$9x^2[a_0r(r - 1)x^{r-2} + a_1(r + 1)r\,x^{r-1} + \cdots + a_{k+1}(r + k + 1)(r + k)x^{r+k-1} + \cdots]$$
$$+\; x(a_0x^r + \cdots + a_kx^{r+k} + \cdots)$$
$$+\; 2(a_0x^r + a_1x^{r+1} + \cdots + a_{k+1}x^{r+k+1} + \cdots) = 0$$

or, collecting terms,

$$a_0[9r(r - 1) + 2]x^r + \{a_1[9(r + 1)r + 2] + a_0\}x^{r+1} + \cdots$$
$$+\; \{a_{k+1}[9(r + k + 1)(r + k) + 2] + a_k\}x^{r+k+1} + \cdots = 0$$

For this to be an identity we must have

$$9r(r - 1) + 2 = 0$$
$$a_1[9(r + 1)r + 2] + a_0 = 0$$
$$\cdots$$
$$a_{k+1}[9(r + k + 1)(r + k) + 2] + a_k = 0$$

The first of these is the indicial equation whose roots are $r = \tfrac13, \tfrac23$. From the second we find
that

$$a_1 = -\frac{a_0}{(3r + 1)(3r + 2)}$$

and, from the general recurrence relation, we have

$$a_{k+1} = -\frac{a_k}{[3(r + k) + 1][3(r + k) + 2]}$$

Considering these first for $r = \tfrac13$ and then for $r = \tfrac23$, we obtain the coefficient sequences

$$r = \tfrac13: \quad a_1 = -\frac{a_0}{2\cdot3}, \quad a_2 = \frac{a_0}{2\cdot3\cdot5\cdot6}, \quad a_3 = -\frac{a_0}{2\cdot3\cdot5\cdot6\cdot8\cdot9}, \;\ldots$$

$$r = \tfrac23: \quad a_1 = -\frac{a_0}{3\cdot4}, \quad a_2 = \frac{a_0}{3\cdot4\cdot6\cdot7}, \quad a_3 = -\frac{a_0}{3\cdot4\cdot6\cdot7\cdot9\cdot10}, \;\ldots$$

With these coefficients, taking $a_0 = 1$ for convenience, we can construct the two particular
solutions

$$y_1 = x^{1/3}\left(1 - \frac{x}{2\cdot3} + \frac{x^2}{2\cdot3\cdot5\cdot6} - \frac{x^3}{2\cdot3\cdot5\cdot6\cdot8\cdot9} + \cdots\right)$$

$$y_2 = x^{2/3}\left(1 - \frac{x}{3\cdot4} + \frac{x^2}{3\cdot4\cdot6\cdot7} - \frac{x^3}{3\cdot4\cdot6\cdot7\cdot9\cdot10} + \cdots\right)$$

Since all values of x except x = 0 are ordinary points of the given differential equation, it follows
from Theorem 2 that these series converge for all values of x. Finally, since $y_1$ and $y_2$ are clearly
independent, i.e., have nonvanishing Wronskian, it follows from Theorem 2, Sec. 2.1, that the
complete solution of the given equation is an arbitrary linear combination of these two particular
solutions.
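The coefficient sequences of this example can be generated exactly with rational arithmetic. The sketch below assumes the recurrence $a_{k+1} = -a_k/\{[3(r + k) + 1][3(r + k) + 2]\}$ with $a_0 = 1$ and also evaluates the residual $9x^2y'' + (x + 2)y$ of a truncated series term by term; the function names, the number of terms, and the sample point x = 1.5 are illustrative choices.

```python
from fractions import Fraction

def example2_coefficients(r, n_terms):
    """Exact coefficients a_0 .. a_{n_terms-1} for exponent r, with a_0 = 1."""
    a = [Fraction(1)]
    for k in range(n_terms - 1):
        s = r + k
        a.append(-a[k] / ((3 * s + 1) * (3 * s + 2)))
    return a

def example2_residual(r, x, n_terms=30):
    """Value of 9 x^2 y'' + (x + 2) y for the truncated series, x > 0."""
    a = example2_coefficients(r, n_terms)
    rf = float(r)
    y = sum(float(ak) * x ** (rf + k) for k, ak in enumerate(a))
    ypp = sum(float(ak) * (rf + k) * (rf + k - 1) * x ** (rf + k - 2)
              for k, ak in enumerate(a))
    return 9.0 * x**2 * ypp + (x + 2.0) * y

for r in (Fraction(1, 3), Fraction(2, 3)):
    print(r, example2_coefficients(r, 4), example2_residual(r, 1.5))
```

Because the coefficients decrease very rapidly, a few dozen terms already make the residual negligible at moderate values of x.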

If the indicial equation has a double root, it is obvious that 
two series solutions cannot be obtained by the present method. 
It is also true (though not obvious) that, if the roots of the indicial 
equation differ by an integer, this method fails, in general, to 
provide a second series solution.* In either of these cases, however,
a second solution can be found by the method of Sec. 2.1,
that is, by assuming $y = \phi(x)y_1(x)$, where $y_1(x)$ is the first series
solution, and then determining $\phi(x)$ so that the product will
satisfy the given differential equation. 

EXERCISES 

1 Find the singular points of each of the following equations, and determine whether they 
are regular or irregular: 

a $xy'' + y' + y = 0$        b $x^2y'' + y' + y = 0$

c $x^2(1 - x)y'' + (1 - x)y' + y = 0$        d $(1 - x^2)y'' + y' + y = 0$

2 Find the indicial equation relative to each of the singular points of each of the equations
in Exercise 1.

Find series solutions around the origin for each of the following equations:

3 $4x^2y'' + (4x + 1)y = 0$        4 $x^2y'' + (x - 2)y = 0$

5 $2x^2y'' + 3xy' + (x^2 - 1)y = 0$        6 $2x^2y'' + (2x^2 + 3x)y' + (x - 1)y = 0$

7 The “point at infinity” is said to be an ordinary point or a singular point of the differential
equation

$$\frac{d^2y}{dx^2} + P(x)\frac{dy}{dx} + Q(x)y = 0$$

according as the equation obtained from this by the substitution x = 1/u has an ordinary
point or a singular point at u = 0. Show that, under this transformation, the original
equation becomes

$$\frac{d^2y}{du^2} + \left[\frac{2}{u} - \frac{1}{u^2}P\!\left(\frac{1}{u}\right)\right]\frac{dy}{du} + \frac{1}{u^4}Q\!\left(\frac{1}{u}\right)y = 0$$

and use this result to determine the nature of the point at infinity for the equation

$$(x^2 + 1)y'' + y' + y = 0$$

8 Show that, if the origin is an irregular singular point of the differential equation (2), then 
the indicial equation relative to the origin is at most of the first degree. 

9 Verify that, if the roots of the indicial equation differ by unity, then, in general, the two 
roots lead to the same series solution of Eq. (2). When will this not be the case? Is this true 
if the roots differ by an integer greater than 1 ? 

10 Verify that under the change of dependent variable defined by the substitution

$$y = z\,e^{-\frac12\int P(x)\,dx}$$

the differential equation (2) becomes

$$\frac{d^2z}{dx^2} + R(x)z = 0$$

where $R(x) = Q(x) - \dfrac12\dfrac{dP(x)}{dx} - \dfrac14P^2(x)$.

* See Exercise 9.




9.2 

The series solution of Bessel’s equation 


Probably the most important of all variable-coefficient differential
equations is

(1) $$x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + (\lambda^2x^2 - \nu^2)y = 0$$

which is known as Bessel's equation of order $\nu$ with parameter $\lambda$.†
This arises in a great variety of problems, including almost all
applications involving partial differential equations, such as the
wave equation or the heat equation, in regions having circular
symmetry.

As a preliminary step in the solution of Eq. (1), let us change
the independent variable from x to t by means of the substitution

(2) $$t = \lambda x$$

Since

$$\frac{dy}{dx} = \lambda\frac{dy}{dt} \qquad \text{and} \qquad \frac{d^2y}{dx^2} = \lambda^2\frac{d^2y}{dt^2}$$

Eq. (1) then becomes

(3) $$t^2\frac{d^2y}{dt^2} + t\frac{dy}{dt} + (t^2 - \nu^2)y = 0$$

which is known simply as Bessel's equation of order $\nu$.

For Eq. (3) it is clear that

$$P(t) = \frac{1}{t} \qquad \text{and} \qquad Q(t) = \frac{t^2 - \nu^2}{t^2}$$

Hence, the origin is a regular singular point of the equation, and
all other values of t are ordinary points. At the origin, where we
propose to obtain series solutions of (3), the indicial equation
[Eq. (7), Sec. 9.1] is $r^2 - \nu^2 = 0$, and, therefore, by the theory
of the preceding section, we are led to try a series solution of the
form


(4) $$y = t^\nu(a_0 + a_1t + a_2t^2 + \cdots)$$

Substituting this series into Eq. (3) and displaying the terms
in a convenient array, we have

$$[a_0\nu(\nu - 1)t^\nu + a_1(\nu + 1)\nu\,t^{\nu+1} + a_2(\nu + 2)(\nu + 1)t^{\nu+2} + \cdots + a_k(\nu + k)(\nu + k - 1)t^{\nu+k} + \cdots]$$
$$+\; [a_0\nu\,t^\nu + a_1(\nu + 1)t^{\nu+1} + a_2(\nu + 2)t^{\nu+2} + \cdots + a_k(\nu + k)t^{\nu+k} + \cdots]$$
$$+\; (a_0t^{\nu+2} + \cdots + a_{k-2}t^{\nu+k} + \cdots)$$
$$+\; (-\nu^2a_0t^\nu - \nu^2a_1t^{\nu+1} - \nu^2a_2t^{\nu+2} - \cdots - \nu^2a_kt^{\nu+k} - \cdots) = 0$$

This will be an identity if and only if the coefficient of every
power of t is zero. The coefficient of $t^\nu$ is automatically zero, since
$\nu$ is a root of the indicial equation. From the coefficient of $t^{\nu+1}$
we obtain the condition

$$a_1(2\nu + 1) = 0$$

and, in general, for $k \ge 2$, we obtain from the coefficient of $t^{\nu+k}$

$$a_k[(\nu + k)(\nu + k - 1) + (\nu + k) - \nu^2] + a_{k-2} = a_kk(2\nu + k) + a_{k-2} = 0$$

or

(5) $$a_k = -\frac{a_{k-2}}{k(2\nu + k)}$$

Now it is clear that $a_1$ must be zero for all values of $\nu$ except
possibly $\nu = -\tfrac12$, and even in this case we can assume $a_1 = 0$,
since we are interested only in conditions sufficient for the
existence of solutions of the form

$$\sum_{k=0}^\infty a_kt^{\nu+k}$$

Moreover, from (5) it is apparent that any coefficient $a_k$ is a
multiple of the second preceding coefficient $a_{k-2}$. Hence, beginning
with $a_1$, it follows that every coefficient with an odd subscript
must vanish.

On the other hand, starting with $a_0$, which is still perfectly
arbitrary, and taking k = 2, 4, 6, . . . successively in the
recurrence formula (5), we have

$$a_2 = -\frac{a_0}{2(2\nu + 2)} = -\frac{a_0}{2^2\cdot1!\,(\nu + 1)}$$

$$a_4 = -\frac{a_2}{4(2\nu + 4)} = -\frac{a_2}{2^2\cdot2(\nu + 2)} = \frac{a_0}{2^4\cdot2!\,(\nu + 2)(\nu + 1)}$$

$$a_6 = -\frac{a_4}{6(2\nu + 6)} = -\frac{a_4}{2^2\cdot3(\nu + 3)} = -\frac{a_0}{2^6\cdot3!\,(\nu + 3)(\nu + 2)(\nu + 1)}$$

and, in general,

$$a_{2m} = \frac{(-1)^ma_0}{2^{2m}\,m!\,(\nu + m)\cdots(\nu + 2)(\nu + 1)}$$

Now $a_{2m}$ is the coefficient of $t^\nu t^{2m} = t^{\nu+2m}$ in the series (4)
for y. Hence it would be convenient if $a_{2m}$ contained the factor
$2^{\nu+2m}$ in its denominator instead of just $2^{2m}$. To achieve this, we
write

$$a_{2m} = \frac{(-1)^m}{2^{\nu+2m}\,m!\,(\nu + m)\cdots(\nu + 2)(\nu + 1)}\,(2^\nu a_0)$$

Furthermore, the factors

$$(\nu + m)\cdots(\nu + 2)(\nu + 1)$$

suggest a factorial. In fact, if $\nu$ were an integer, a factorial could
be created by multiplying numerator and denominator by $\nu!$
However, since $\nu$ is not necessarily an integer, we must use not
$\nu!$ but its generalization $\Gamma(\nu + 1)$ (Sec. 7.3) for this purpose.

† Named for the German mathematician and astronomer Friedrich Wilhelm
Bessel (1784-1846), although special cases of this equation had been studied
earlier by Jakob Bernoulli (1703), Daniel Bernoulli (1732), and Leonhard
Euler (1764).




SEC. 9.2 


THE SERIES SOLUTION OF BESSEL'S EQUATION 


353 


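Then, except for the values

$$\nu = -1, -2, -3, \ldots$$

for which $\Gamma(\nu + 1)$ is not defined, we can write

$$a_{2m} = \frac{(-1)^m}{2^{\nu+2m}\, m!\,(\nu+m)\cdots(\nu+2)(\nu+1)\Gamma(\nu+1)}\,[2^{\nu}\Gamma(\nu+1)a_0]$$

Since the gamma function satisfies the recurrence relation

$$z\Gamma(z) = \Gamma(z + 1)$$

the expression for $a_{2m}$ becomes finally

$$a_{2m} = \frac{(-1)^m}{2^{\nu+2m}\, m!\,\Gamma(\nu+m+1)}\,[2^{\nu}\Gamma(\nu+1)a_0]$$

Since $a_0$ is arbitrary and since we are looking only for particular
solutions, we choose

$$a_0 = \frac{1}{2^{\nu}\Gamma(\nu+1)}$$

so that

$$a_{2m} = \frac{(-1)^m}{2^{\nu+2m}\, m!\,\Gamma(\nu+m+1)}$$

The series for y is, therefore, from (4),

$$
\begin{aligned}
y(t) &= t^{\nu}\left[\frac{1}{2^{\nu}\Gamma(\nu+1)} - \frac{t^2}{2^{\nu+2}\Gamma(\nu+2)} + \frac{t^4}{2^{\nu+4}\,2!\,\Gamma(\nu+3)} - \cdots\right] \\
&= \sum_{m=0}^{\infty} \frac{(-1)^m t^{\nu+2m}}{2^{\nu+2m}\, m!\,\Gamma(\nu+m+1)}
\end{aligned}
\tag{6}
$$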


The function defined by this infinite series is known as the
Bessel function of the first kind of order $\nu$ and is denoted by the
symbol $J_\nu(t)$. Since Bessel’s equation of order $\nu$ has no finite
singular points except the origin, it follows from Theorem 2,
Sec. 9.1, that the series for $J_\nu(t)$ converges for all values of t if
$\nu \ge 0$. The graphs of $J_0(t)$ and $J_1(t)$ are shown in Fig. 9.2. Their
resemblance to the graphs of cos t and sin t is interesting. In
particular, they illustrate the important fact that for every value
of $\nu$ the equation $J_\nu(t) = 0$ has infinitely many real roots.
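The series (6) can be summed directly for moderate arguments. The sketch below is only illustrative: it assumes $t \ge 0$ and $\nu \ge 0$, truncates the series at 40 terms (an arbitrary choice), and uses the hypothetical name `bessel_j`. The value 2.404825557695773 is the well-known first positive zero of $J_0$.

```python
import math

def bessel_j(nu, t, terms=40):
    """Partial sum of series (6) for J_nu(t); assumes t >= 0 and nu >= 0."""
    total = 0.0
    for m in range(terms):
        total += ((-1) ** m * (t / 2.0) ** (nu + 2 * m)
                  / (math.factorial(m) * math.gamma(nu + m + 1)))
    return total

print(bessel_j(0, 1.0))                  # J_0(1)
print(bessel_j(0, 2.404825557695773))    # close to zero: first root of J_0
```

For large t the alternating terms grow before they decay, so a truncated power series loses accuracy; this sketch is meant for small and moderate t only.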

Let us now consider the series arising from the other root
of the indicial equation, namely, $r = -\nu$. We could, of course,
begin again with a series analogous to (4) and determine its
coefficients one by one, just as we did for $J_\nu(t)$. There is no need





to do this, however, for the final result can be obtained at once
simply by replacing $\nu$ by $-\nu$ in the series (6), provided that the
gamma functions appearing in the denominators of the various
terms are all defined. This is necessarily the case unless $\nu$ is an
integer; hence when $\nu$ is not an integer the function

$$J_{-\nu}(t) = \sum_{m=0}^{\infty} \frac{(-1)^m t^{-\nu+2m}}{2^{-\nu+2m}\, m!\,\Gamma(-\nu+m+1)} \tag{7}$$

is a second particular solution of Bessel’s equation of order $\nu$.
Moreover, since $J_{-\nu}(t)$ contains negative powers of t while $J_\nu(t)$
does not, it is obvious that in the neighborhood of the origin
$J_{-\nu}(t)$ is unbounded while $J_\nu(t)$ remains finite. Hence $J_\nu(t)$ and
$J_{-\nu}(t)$ cannot be proportional and, therefore, are two independent
solutions of the Bessel equation. According to Theorem 2, Sec. 2.1,
a complete solution of Bessel’s equation when $\nu$ is not an integer
is then

$$y(t) = c_1J_\nu(t) + c_2J_{-\nu}(t) \tag{8}$$

Instead of $J_{-\nu}(t)$ some writers take the linear combination

$$Y_\nu(t) = \frac{\cos \nu\pi\, J_\nu(t) - J_{-\nu}(t)}{\sin \nu\pi} \tag{9}$$

as a second, independent solution of Bessel’s equation. Using
$Y_\nu(t)$, which is known as the Bessel function of the second kind
of order $\nu$, a complete solution of Bessel’s equation can be
written,

$$y(t) = c_1J_\nu(t) + c_2Y_\nu(t) \qquad \nu \text{ not an integer} \tag{10}$$
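Definition (9) can be evaluated directly from the two series when $\nu$ is not an integer. The following sketch assumes $t > 0$, a fixed truncation of 40 terms, and hypothetical function names; it checks the half-integer case, where $Y_{1/2}(t) = -J_{-1/2}(t) = -\sqrt{2/(\pi t)}\cos t$.

```python
import math

def bessel_j(nu, t, terms=40):
    """Partial sum of series (6); assumes t > 0 and nu not a negative integer."""
    total = 0.0
    for m in range(terms):
        total += ((-1) ** m * (t / 2.0) ** (nu + 2 * m)
                  / (math.factorial(m) * math.gamma(nu + m + 1)))
    return total

def bessel_y(nu, t):
    """Y_nu(t) from definition (9); valid only when nu is not an integer."""
    if float(nu).is_integer():
        raise ValueError("definition (9) is indeterminate for integer order")
    s = math.sin(nu * math.pi)
    return (math.cos(nu * math.pi) * bessel_j(nu, t) - bessel_j(-nu, t)) / s

t = 2.0
print(bessel_y(0.5, t), -math.sqrt(2 / (math.pi * t)) * math.cos(t))
```

When $\nu$ is close to an integer the quotient in (9) suffers from cancellation, so this direct evaluation is a demonstration of the definition rather than a robust numerical method.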

In some applications it is convenient to use still another
form of the general solution of Bessel’s equation. This is based
upon the two particular solutions

$$
\begin{aligned}
H_\nu^{(1)}(t) &= J_\nu(t) + iY_\nu(t) \\
H_\nu^{(2)}(t) &= J_\nu(t) - iY_\nu(t)
\end{aligned}
\tag{11}
$$

These are known as Hankel functions* or Bessel functions of
the third kind of order $\nu$, and in terms of them a complete
solution of Eq. (3) can be written,

$$y(t) = c_1H_\nu^{(1)}(t) + c_2H_\nu^{(2)}(t) \qquad \nu \text{ not an integer} \tag{12}$$

It is interesting to note that (8), (10), and (12) are correct
expressions for the general solution of Eq. (3) even when $\nu$ is
an odd multiple of $\tfrac12$ and the roots of the indicial equation
$r^2 - \nu^2 = 0$ differ by an integer. In the last section we pointed
out that, when this happens, a second, independent series
solution of the form (4) will usually not exist. It may exist,
however, and Bessel’s equation is one of the instances when it actually
does.†


* Named for the German mathematician Hermann Hankel (1839-1873).

† See Exercise 9, Sec. 9.1.


If $\nu$ is an integer, say $\nu = n$, the situation is somewhat
different. Again the roots of the indicial equation differ by an integer,
namely, 2n, and it is to be expected that a second solution of the
form (4) will not exist. In fact, considering $J_{-n}(t)$ as the limit
of $J_\nu(t)$ as $\nu$ approaches $-n$ and remembering that when its
argument approaches any nonpositive integer the gamma
function becomes infinite, it follows that as $\nu$ approaches $-n$, the
first n terms in the series (6) approach zero and the series
effectively begins with the term for which m = n:

$$J_{-n}(t) = \sum_{m=n}^{\infty} \frac{(-1)^m t^{-n+2m}}{2^{-n+2m}\, m!\,\Gamma(-n+m+1)}$$

In this, let the variable of summation be changed from m to j
by the substitution m = j + n. Then

$$
\begin{aligned}
J_{-n}(t) &= \sum_{j=0}^{\infty} \frac{(-1)^{j+n} t^{-n+2(j+n)}}{2^{-n+2(j+n)}\,(j+n)!\,\Gamma[-n+(j+n)+1]} \\
&= \sum_{j=0}^{\infty} \frac{(-1)^{n}(-1)^{j} t^{n+2j}}{2^{n+2j}\,\Gamma(n+j+1)\,j!} \\
&= (-1)^n J_n(t)
\end{aligned}
$$

Thus, when $\nu$ is an integer n, the function $J_{-\nu}(t)$ is proportional
to $J_\nu(t)$. These two solutions are therefore not independent, and
the linear combination $c_1J_\nu(t) + c_2J_{-\nu}(t)$ is no longer a
complete solution of Bessel’s equation. Moreover, without additional
definitions, neither (10) nor (12) provides a complete solution,
since $Y_\nu(t)$, as defined by (9), assumes the indeterminate form
0/0 when $\nu$ is an integer.
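The proportionality $J_{-n}(t) = (-1)^n J_n(t)$ can be observed numerically if the reciprocal of the gamma function is taken to be zero at its poles, which is the limiting behavior described above. The sketch below is illustrative only: `rgamma` and `bessel_j` are hypothetical helper names, the series is truncated at 40 terms, and $t > 0$ is assumed.

```python
import math

def rgamma(z):
    """1/Gamma(z), taken to be 0 at the poles z = 0, -1, -2, ..."""
    if z <= 0 and float(z).is_integer():
        return 0.0
    return 1.0 / math.gamma(z)

def bessel_j(nu, t, terms=40):
    """Partial sum of series (6); terms with Gamma at a pole contribute zero."""
    total = 0.0
    for m in range(terms):
        total += ((-1) ** m * (t / 2.0) ** (nu + 2 * m)
                  * rgamma(nu + m + 1) / math.factorial(m))
    return total

for n in (1, 2, 3):
    # The two printed columns should agree for each n.
    print(n, bessel_j(-n, 1.7), (-1) ** n * bessel_j(n, 1.7))
```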

A complete solution when $\nu$ is an integer can be found in
any of several ways. One is to use the method developed in
Sec. 2.1 for finding a second solution of a linear second-order
differential equation when one solution is known. The result, as
given by Eq. (5) of Sec. 2.1 with $y_1(t) = J_n(t)$ and $P(t) = 1/t$, is

$$y(t) = c_1J_n(t)\int \frac{dt}{tJ_n^2(t)} + c_2J_n(t)$$

The usual procedure, however, is to obtain a second, independent
solution by evaluating the limit of $Y_\nu(t)$ as $\nu \to n$. The details
are somewhat involved, and we shall not present them here.
The limit function, which exists and is independent of $J_n(t)$ for
all values of n, is commonly denoted by $Y_n(t)$; that is,

$$Y_n(t) = \lim_{\nu \to n} Y_\nu(t) = \lim_{\nu \to n} \frac{\cos \nu\pi\, J_\nu(t) - J_{-\nu}(t)}{\sin \nu\pi} \tag{13}$$

The corresponding specializations of the Hankel functions (11)
are defined in the obvious way in terms of $Y_n(t)$:

$$
\begin{aligned}
H_n^{(1)}(t) &= J_n(t) + iY_n(t) \\
H_n^{(2)}(t) &= J_n(t) - iY_n(t)
\end{aligned}
\tag{14}
$$




With Formulas (13) and (14), we can now eliminate from (10)
and (12) the restriction that $\nu$ is not an integer and use these
results for all values of $\nu$, integral as well as nonintegral. Plots
of $Y_0(t)$ and $Y_1(t)$ are shown in Fig. 9.3. Among other things,
they illustrate the important fact that, for all values of $\nu$, $Y_\nu(t)$
is unbounded in the neighborhood of the origin and has infinitely
many real zeros.



Reversing the transformation (2) which we used to eliminate
the parameter $\lambda$ from the general form of Bessel’s equation (1),
we can now summarize the results of the preceding discussion
in the following theorem:

THEOREM 1

For all values of $\nu$, a complete solution of Bessel’s equation of order $\nu$ with
parameter $\lambda$,

$$x^2y'' + xy' + (\lambda^2x^2 - \nu^2)y = 0$$

can be written in either of the forms

$$y(x) = c_1J_\nu(\lambda x) + c_2Y_\nu(\lambda x)$$

$$y(x) = c_1H_\nu^{(1)}(\lambda x) + c_2H_\nu^{(2)}(\lambda x)$$

If $\nu$ is not an integer, a complete solution can also be written

$$y(x) = c_1J_\nu(\lambda x) + c_2J_{-\nu}(\lambda x)$$

$J_\nu(\lambda x)$, $J_{-\nu}(\lambda x)$, and $Y_\nu(\lambda x)$ all have infinitely many real zeros. If $\nu \ge 0$, $J_\nu(\lambda x)$
is finite for all values of x, but $J_{-\nu}(\lambda x)$ and $Y_\nu(\lambda x)$ are unbounded in the
neighborhood of the origin. $H_\nu^{(1)}(\lambda x)$ and $H_\nu^{(2)}(\lambda x)$ are complex-valued functions when x
is real.

EXERCISES 

1 If $y_1$ and $y_2$ are any two solutions of Bessel’s equation of order $\nu$, show that $y_1y_2' - y_1'y_2 = c/x$,
where c is a suitable constant. (Hint: Recall Abel’s identity from Sec. 2.1.)


SEC. 9.3 


MODIFIED BESSEL FUNCTIONS 


35 7 


2 By determining the coefficient of 1/x on the left-hand side, show that

$$J_\nu(x)J_{-\nu}'(x) - J_\nu'(x)J_{-\nu}(x) = -\frac{2}{\pi x}\sin \nu\pi$$

[Hint: Use the result of Exercise 1 and the fact that $\Gamma(z)\Gamma(1 - z) = \pi/(\sin \pi z)$ if z is not
an integer.]

3 If $\nu$ is not an integer, show that $J_\nu(x)$ and $J_{-\nu}(x)$ have no zeros in common. (Hint: Use the
result of Exercise 2.)

4 Show that

$$Y = \frac{\pi}{2 \sin \nu\pi}\left[J_\nu(x)\int^x f(s)J_{-\nu}(s)\,ds - J_{-\nu}(x)\int^x f(s)J_\nu(s)\,ds\right]$$

is a particular integral of the nonhomogeneous Bessel equation

$$x^2y'' + xy' + (x^2 - \nu^2)y = xf(x)$$

if $\nu$ is not an integer. (Hint: Use the method of variation of parameters and the results of
Exercise 2.)

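5 Show that, under the transformation $y = u/\sqrt{t}$, Bessel’s equation of order $\nu$ becomes

$$\frac{d^2u}{dt^2} + \left(1 + \frac{1 - 4\nu^2}{4t^2}\right)u = 0$$

Hence show that, for large values of t, solutions of Bessel’s equation are approximately
described by expressions of the form

$$c_1\frac{\sin t}{\sqrt{t}} + c_2\frac{\cos t}{\sqrt{t}}$$

[More precisely, it can be shown that

$$J_\nu(t) \sim \sqrt{\frac{2}{\pi t}}\cos\left(t - \frac{\pi}{4} - \frac{\nu\pi}{2}\right) \qquad Y_\nu(t) \sim \sqrt{\frac{2}{\pi t}}\sin\left(t - \frac{\pi}{4} - \frac{\nu\pi}{2}\right)$$

where the symbol $\sim$ means that the limit of the ratio of the two quantities connected by it
approaches 1 as t becomes infinite.]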


9.3

Modified Bessel functions


There are certain equations closely resembling Bessel’s equation
which occur so often that their solutions are also named and
studied as functions in their own right. The most important of
these is

$$\frac{d^2y}{dx^2} + \frac{1}{x}\frac{dy}{dx} - \left(1 + \frac{\nu^2}{x^2}\right)y = 0 \tag{1}$$

which is known as the modified Bessel equation of order $\nu$. Since
this can be written in the form

$$x^2y'' + xy' + (i^2x^2 - \nu^2)y = 0$$

it is evident that this is nothing but Bessel’s equation of order $\nu$
with the imaginary parameter $\lambda = i$. However, in actual
applications, to write the complete solution of (1) in the form

$$y = c_1J_\nu(ix) + c_2Y_\nu(ix)$$

and retain the imaginaries would be about as awkward as to
take the solution of

$$y'' - y = 0$$

to be

$$y = c_1 \cos ix + c_2 \sin ix$$

and use this complex expression instead of resorting to real
exponentials or hyperbolic functions. Accordingly, we seek
modifications of $J_\nu(ix)$ and $Y_\nu(ix)$ which will be real functions of real
variables.


Now,

$$J_\nu(ix) = \sum_{k=0}^{\infty} \frac{(-1)^k (ix)^{\nu+2k}}{2^{\nu+2k}\, k!\,\Gamma(\nu+k+1)} = i^{\nu}\sum_{k=0}^{\infty} \frac{x^{\nu+2k}}{2^{\nu+2k}\, k!\,\Gamma(\nu+k+1)}$$

Moreover, $J_\nu(ix)$ multiplied by any constant will also be a
solution of the equation we are considering. Hence, in particular,
we can multiply it by $i^{-\nu}$, getting

$$i^{-\nu}J_\nu(ix) = \sum_{k=0}^{\infty} \frac{x^{\nu+2k}}{2^{\nu+2k}\, k!\,\Gamma(\nu+k+1)}$$

This is a completely real function, identical with $J_\nu(x)$ except
that its terms, instead of alternating in sign, are all positive.
This new function, which is related to $J_\nu(x)$ in the same way that
cosh x and sinh x are related to cos x and sin x, is known as the
modified Bessel function of the first kind of order $\nu$, $I_\nu(x)$. If $\nu$
is not an integer, the function $I_{-\nu}(x)$ obtained from $I_\nu(x)$ by
replacing $\nu$ by $-\nu$ throughout is a second, independent solution
of Eq. (1), whose complete solution can therefore be written

$$y = c_1I_\nu(x) + c_2I_{-\nu}(x)$$
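The relation $I_\nu(x) = i^{-\nu}J_\nu(ix)$ can be illustrated numerically. The sketch below assumes $x > 0$, $\nu \ge 0$, a 40-term truncation, and principal-branch complex powers as computed by Python; the function names are hypothetical. It also checks the elementary closed form $I_{1/2}(x) = \sqrt{2/(\pi x)}\,\sinh x$.

```python
import math

def bessel_i(nu, x, terms=40):
    """Partial sum of the all-positive series for I_nu(x); assumes x > 0, nu >= 0."""
    return sum((x / 2.0) ** (nu + 2 * k)
               / (math.factorial(k) * math.gamma(nu + k + 1))
               for k in range(terms))

def bessel_j_complex(nu, z, terms=40):
    """The series for J_nu evaluated at a complex argument z (principal branch)."""
    return sum((-1) ** k * (z / 2.0) ** (nu + 2 * k)
               / (math.factorial(k) * math.gamma(nu + k + 1))
               for k in range(terms))

nu, x = 0.3, 1.2                      # illustrative values only
via_j = (1j) ** (-nu) * bessel_j_complex(nu, 1j * x)
print(bessel_i(nu, x), via_j)         # imaginary part of via_j should be negligible
```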


On the other hand, instead of using $I_{-\nu}(x)$, many writers
take the second solution of the modified Bessel equation to be
the linear combination

$$K_\nu(x) = \frac{\pi}{2}\,\frac{I_{-\nu}(x) - I_\nu(x)}{\sin \nu\pi}$$

which is known as the modified Bessel function of the second
kind of order $\nu$. If $\nu$ is not an integer, this is a well-defined
solution which is clearly independent of $I_\nu(x)$. If $\nu$ is an integer n,
this assumes the indeterminate form 0/0, but a tedious
evaluation by L’Hospital's rule leads to a limiting expression

$$K_n(x) = \lim_{\nu \to n} K_\nu(x) = \lim_{\nu \to n} \frac{\pi}{2}\,\frac{I_{-\nu}(x) - I_\nu(x)}{\sin \nu\pi}$$

which is a solution independent of $I_n(x)$. This is a useful result,
because, as we might expect, $I_\nu(x)$ and $I_{-\nu}(x)$ are not
independent when $\nu$ is an integer. In fact, when $\nu = n$, we have the identity

$$(-1)^nJ_{-n}(ix) = J_n(ix)$$

and then, by obvious steps,

$$
\begin{aligned}
(i^2)^nJ_{-n}(ix) &= J_n(ix) \\
i^nJ_{-n}(ix) &= i^{-n}J_n(ix) \\
I_{-n}(x) &= I_n(x)
\end{aligned}
$$

Plots of $I_0(x)$ and $I_1(x)$ are shown in Fig. 9.4; plots of $K_0(x)$
and $K_1(x)$ in Fig. 9.5. As these graphs illustrate, the modified
Bessel functions have no real zeros except possibly x = 0. They
also illustrate that, for $\nu \ge 0$, $I_\nu(x)$ is finite at the origin, but
$K_\nu(x)$, like $I_{-\nu}(x)$, becomes infinite as x approaches zero.

Just as the ordinary Bessel equation, so the modified Bessel
equation frequently occurs in a form containing a parameter $\lambda$:

$$\frac{d^2y}{dx^2} + \frac{1}{x}\frac{dy}{dx} - \left(\lambda^2 + \frac{\nu^2}{x^2}\right)y = 0$$

A complete solution of this is, of course,

$$y = c_1I_\nu(\lambda x) + c_2K_\nu(\lambda x) \qquad \nu \text{ unrestricted}$$






If $\nu$ is not an integer, we have the alternative form

$$y = c_1I_\nu(\lambda x) + c_2I_{-\nu}(\lambda x)$$

A second equation closely related to Bessel’s equation is

$$\frac{d^2y}{dx^2} + \frac{1}{x}\frac{dy}{dx} - \left(i + \frac{\nu^2}{x^2}\right)y = 0 \tag{3}$$

This can be regarded either as Bessel’s equation of order $\nu$ with
parameter $\lambda = \pm\sqrt{-i}$ or as the modified Bessel equation of
order $\nu$ with parameter $\lambda = \pm\sqrt{i}$. From the former point of
view a complete solution can be written

$$y = c_1J_\nu(\pm\sqrt{-i}\,x) + c_2Y_\nu(\pm\sqrt{-i}\,x)$$

From the second point of view the solution can be written

$$y = d_1I_\nu(\pm\sqrt{i}\,x) + d_2K_\nu(\pm\sqrt{i}\,x)$$

Now a complete solution can be constructed from any pair of
independent particular solutions; and it is customary in studying
Eq. (3) to select $J_\nu(\pm\sqrt{-i}\,x)$ and $K_\nu(\pm\sqrt{i}\,x)$ for this purpose.
The solution becomes unambiguous when a choice is made
between the two square roots in each case. Naturally enough,
the positive, or principal, square roots are chosen. Then since*

$$i = \cos\frac{\pi}{2} + i\sin\frac{\pi}{2} = e^{i\pi/2} \qquad \text{and} \qquad -i = \cos\frac{3\pi}{2} + i\sin\frac{3\pi}{2} = e^{3i\pi/2}$$

it follows that

$$\sqrt{i} = (e^{i\pi/2})^{1/2} = i^{1/2} \qquad \sqrt{-i} = (e^{3i\pi/2})^{1/2} = i^{3/2}$$


* See Formula (7), Sec. 14.7.




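and we have for the complete solution

$$y = cJ_\nu(i^{3/2}x) + dK_\nu(i^{1/2}x)$$

Now

$$J_\nu(i^{3/2}x) = \sum_{k=0}^{\infty} \frac{(-1)^k (i^{3/2}x)^{\nu+2k}}{2^{\nu+2k}\, k!\,\Gamma(\nu+k+1)} = i^{3\nu/2}\sum_{k=0}^{\infty} \frac{(-1)^k i^{3k} x^{\nu+2k}}{2^{\nu+2k}\, k!\,\Gamma(\nu+k+1)}$$

Moreover, $i^{3k}$ can take on only one of the four values

$$
i^{3k} =
\begin{cases}
1 & k = 0, 4, 8, \ldots \\
-i & k = 1, 5, 9, \ldots \\
-1 & k = 2, 6, 10, \ldots \\
i & k = 3, 7, 11, \ldots
\end{cases}
$$

Hence, the first, third, fifth, . . . terms in the last series are real, and
the second, fourth, sixth, . . . are imaginary. Separating the
series into its real and imaginary parts, we obtain

$$
\begin{aligned}
J_\nu(i^{3/2}x) &= i^{3\nu/2}\left[\sum_{j=0}^{\infty} \frac{(-1)^j x^{\nu+4j}}{2^{\nu+4j}(2j)!\,\Gamma(\nu+2j+1)} + i\sum_{j=0}^{\infty} \frac{(-1)^j x^{\nu+2+4j}}{2^{\nu+2+4j}(2j+1)!\,\Gamma(\nu+2j+2)}\right] \\
&= i^{3\nu/2}\,[\Sigma_r + i\Sigma_i]
\end{aligned}
$$

Furthermore,

$$i^{3\nu/2} = (e^{i\pi/2})^{3\nu/2} = e^{3i\pi\nu/4} = \cos\frac{3\nu\pi}{4} + i\sin\frac{3\nu\pi}{4}$$

and, therefore,

$$
\begin{aligned}
J_\nu(i^{3/2}x) &= \left(\cos\frac{3\nu\pi}{4} + i\sin\frac{3\nu\pi}{4}\right)(\Sigma_r + i\Sigma_i) \\
&= \left(\cos\frac{3\nu\pi}{4}\,\Sigma_r - \sin\frac{3\nu\pi}{4}\,\Sigma_i\right) + i\left(\cos\frac{3\nu\pi}{4}\,\Sigma_i + \sin\frac{3\nu\pi}{4}\,\Sigma_r\right)
\end{aligned}
$$

$J_\nu(i^{3/2}x)$ thus consists of one purely real series plus i times a
second purely real series. The series forming the real part of this
expression defines the function $\mathrm{ber}_\nu\,x$. The series forming the
imaginary part defines the function $\mathrm{bei}_\nu\,x$. The letters be suggest
the relation between these new functions and the Bessel functions
themselves. The terminal letters r and i, of course, suggest the
adjectives real and imaginary. For the important case $\nu = 0$,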




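we have explicitly

$$\mathrm{ber}_0\,x \equiv \mathrm{ber}\,x = \sum_{j=0}^{\infty} \frac{(-1)^j x^{4j}}{2^{4j}[(2j)!]^2}$$

$$\mathrm{bei}_0\,x \equiv \mathrm{bei}\,x = \sum_{j=0}^{\infty} \frac{(-1)^j x^{4j+2}}{2^{4j+2}[(2j+1)!]^2}$$

Plots of ber x and bei x are shown in Fig. 9.6. The graphs oscillate
with ever-increasing amplitudes.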
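The two series for ber x and bei x can be compared with the real and imaginary parts of $J_0(i^{3/2}x)$ computed in complex arithmetic. This is a minimal sketch with hypothetical function names and fixed truncations; it relies on Python evaluating `1j ** 1.5` on the principal branch, which matches the choice of square root made in the text.

```python
import math

def ber(x, terms=20):
    """Partial sum of the series for ber x."""
    return sum((-1) ** j * (x / 2.0) ** (4 * j) / math.factorial(2 * j) ** 2
               for j in range(terms))

def bei(x, terms=20):
    """Partial sum of the series for bei x."""
    return sum((-1) ** j * (x / 2.0) ** (4 * j + 2) / math.factorial(2 * j + 1) ** 2
               for j in range(terms))

def j0_complex(z, terms=40):
    """Series for J_0 at a complex argument z."""
    return sum((-1) ** k * (z / 2) ** (2 * k) / math.factorial(k) ** 2
               for k in range(terms))

x = 1.5
z = (1j ** 1.5) * x                   # i^(3/2) x on the principal branch
w = j0_complex(z)
print(w.real, ber(x))
print(w.imag, bei(x))
```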



In a similar way the function $K_\nu(i^{1/2}x)$ can be expressed as a
real series plus i times another real series. These series are taken
as the definitions of the new functions $\mathrm{ker}_\nu\,x$ and $\mathrm{kei}_\nu\,x$,
respectively. A complete solution of Eq. (3) can thus be written

$$y = c(\mathrm{ber}_\nu\,x + i\,\mathrm{bei}_\nu\,x) + d(\mathrm{ker}_\nu\,x + i\,\mathrm{kei}_\nu\,x)$$

The function $\mathrm{ber}_\nu\,x + i\,\mathrm{bei}_\nu\,x$ is finite at the origin, but becomes
infinite as x becomes infinite; $\mathrm{ker}_\nu\,x + i\,\mathrm{kei}_\nu\,x$ is infinite at the
origin, but approaches zero as x becomes infinite.

EXERCISES 

1 Show that /,(*-%) = ber, x — i bei, x. 

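2 Show that $J_0(i^{1/2}x) = \mathrm{ber}\,x - i\,\mathrm{bei}\,x$. Is $J_\nu(i^{1/2}x) = \mathrm{ber}_\nu\,x - i\,\mathrm{bei}_\nu\,x$ in general?

3 Show that $(x\,\mathrm{ber}'\,x)' = -x\,\mathrm{bei}\,x$ and that $(x\,\mathrm{bei}'\,x)' = x\,\mathrm{ber}\,x$.

4 Write out $\mathrm{ber}_1\,x$ and $\mathrm{bei}_1\,x$.

5 Show that, under the transformation $y = u/\sqrt{x}$, the modified Bessel equation of order $\nu$
becomes

$$\frac{d^2u}{dx^2} - \left(1 + \frac{4\nu^2 - 1}{4x^2}\right)u = 0$$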


SEC. 9.4 


EQUATIONS REDUCIBLE TO BESSEL’S EQUATION 


363 


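Hence show that, for large values of x, solutions of the modified Bessel equation are
approximately described by expressions of the form

$$c_1\frac{e^{-x}}{\sqrt{x}} + c_2\frac{e^{x}}{\sqrt{x}}$$

(More precisely, it can be shown that, as x becomes infinite,

$$I_\nu(x) \sim \frac{e^{x}}{\sqrt{2\pi x}} \qquad K_\nu(x) \sim \sqrt{\frac{\pi}{2x}}\,e^{-x}\,)$$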

9.4 

Equations reducible to Bessel’s equation 

There are many differential equations whose solutions can be 
expressed in terms of Bessel functions. In particular, we have 
the large and important family described in the following theorem: 


THEOREM 1 

If $(1 - a)^2 \ge 4c$ and if neither d, p, nor q is zero, then, except in the obvious
special cases when it reduces to Euler’s equation,* the differential equation

$$x^2y'' + x(a + 2bx^p)y' + [c + dx^{2q} + b(a + p - 1)x^p + b^2x^{2p}]y = 0$$

has as a complete solution

$$y = x^{\alpha}e^{-\beta x^p}[c_1J_\nu(\lambda x^q) + c_2Y_\nu(\lambda x^q)]$$

where

$$\alpha = \frac{1 - a}{2} \qquad \beta = \frac{b}{p} \qquad \lambda = \frac{\sqrt{|d|}}{q} \qquad \nu = \frac{\sqrt{(1 - a)^2 - 4c}}{2q}$$

If d < 0, $J_\nu$ and $Y_\nu$ are to be replaced by $I_\nu$ and $K_\nu$, respectively. If $\nu$ is not an
integer, $Y_\nu$ and $K_\nu$ can be replaced by $J_{-\nu}$ and $I_{-\nu}$ if desired.


The proof of this theorem, while straightforward, is lengthy and
involved, and we shall not present it here. It consists in
transforming the given equation by means of the substitutions

$$y = x^{(1-a)/2}e^{-(b/p)x^p}\,Y \qquad \text{and} \qquad x = \left(\frac{qX}{\sqrt{|d|}}\right)^{1/q}$$

and verifying that, when the parameters are properly identified,
the result is precisely Bessel’s equation.

One special case of Theorem 1 is of sufficient interest to be
stated as a corollary:


COROLLARY 1 

If $(1 - r)^2 \ge 4b$, if $a \ne 0$, and if either $s > r - 2$ or $b = 0$, then a complete
solution of the equation

$$(x^ry')' + (ax^s + bx^{r-2})y = 0$$

is

$$y = x^{\alpha}[c_1J_\nu(\lambda x^{\gamma}) + c_2Y_\nu(\lambda x^{\gamma})]$$

where

$$\alpha = \frac{1 - r}{2} \qquad \gamma = \frac{2 - r + s}{2} \qquad \lambda = \frac{2\sqrt{|a|}}{2 - r + s} \qquad \nu = \frac{\sqrt{(1 - r)^2 - 4b}}{2 - r + s}$$

If a < 0, $J_\nu$ and $Y_\nu$ are to be replaced by $I_\nu$ and $K_\nu$, respectively. If $\nu$ is not an
integer, $Y_\nu$ and $K_\nu$ can be replaced by $J_{-\nu}$ and $I_{-\nu}$ if desired.


* Equation (10), Sec. 2.6.


EXAMPLE 1 

Find a complete solution of the equation

$$x^2y'' + x(4x^4 - 3)y' + (4x^8 - 5x^2 + 3)y = 0$$

Clearly, this is a special case of the equation of Theorem 1 with

$$a = -3 \qquad b = 2 \qquad p = 4 \qquad c = 3 \qquad d = -5 \qquad q = 1$$

Hence,

$$\alpha = 2 \qquad \beta = \tfrac12 \qquad \lambda = \sqrt{|-5|} = \sqrt{5} \qquad \text{and} \qquad \nu = 1$$

A complete solution is, therefore,

$$y = x^2e^{-x^4/2}[c_1I_1(\sqrt{5}\,x) + c_2K_1(\sqrt{5}\,x)]$$
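The parameter identification in Theorem 1 is mechanical and can be written as a small routine. The sketch below is only an illustration of the formulas for $\alpha$, $\beta$, $\lambda$, and $\nu$; the function name `reduce_to_bessel` and its return convention are hypothetical, and the routine does not test for the Euler-equation special cases excluded by the theorem.

```python
import math

def reduce_to_bessel(a, b, c, d, p, q):
    """Return (alpha, beta, lam, nu, kinds) for the equation of Theorem 1.

    kinds is ("J", "Y") when d > 0 and ("I", "K") when d < 0.
    """
    if 0 in (d, p, q):
        raise ValueError("d, p, and q must all be nonzero")
    disc = (1 - a) ** 2 - 4 * c
    if disc < 0:
        raise ValueError("requires (1 - a)^2 >= 4c")
    alpha = (1 - a) / 2
    beta = b / p
    lam = math.sqrt(abs(d)) / q
    nu = math.sqrt(disc) / (2 * q)
    kinds = ("J", "Y") if d > 0 else ("I", "K")
    return alpha, beta, lam, nu, kinds

# Coefficients of Example 1: a = -3, b = 2, c = 3, d = -5, p = 4, q = 1
alpha, beta, lam, nu, kinds = reduce_to_bessel(-3, 2, 3, -5, 4, 1)
print(alpha, beta, lam, nu, kinds)
```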

EXAMPLE 2 

What is a complete solution of the equation $y'' + y = 0$?

Obviously, one possibility is

$$y = c_1 \cos x + c_2 \sin x$$

However, $y'' + y = 0$ is also a special case of the equation of Corollary 1, with r = 0, s = 0,
a = 1, and b = 0. Hence,

$$\alpha = \tfrac12 \qquad \gamma = 1 \qquad \lambda = 1 \qquad \nu = \tfrac12$$

and so we can also write

$$y = d_1\sqrt{x}\,J_{1/2}(x) + d_2\sqrt{x}\,J_{-1/2}(x)$$

It follows, therefore, from Theorem 2, Sec. 2.1, that, for proper choice of the constants $c_1$ and $c_2$,
each of the particular solutions

$$\sqrt{x}\,J_{1/2}(x) \qquad \text{and} \qquad \sqrt{x}\,J_{-1/2}(x)$$

must be expressible in the form $c_1 \cos x + c_2 \sin x$.

Now, since $\Gamma(\tfrac32) = \tfrac12\Gamma(\tfrac12) = \tfrac12\sqrt{\pi}$, the series for $J_{1/2}(x)$ begins with the term

$$\frac{x^{1/2}}{2^{1/2}\,\Gamma(\tfrac32)} = \sqrt{\frac{2}{\pi}}\,x^{1/2}$$

Hence, the series for $\sqrt{x}\,J_{1/2}(x)$ begins with the term $\sqrt{2/\pi}\,x$. Therefore, if we write

$$\sqrt{x}\,J_{1/2}(x) = \sqrt{\frac{2}{\pi}}\,x - \cdots = c_1 \cos x + c_2 \sin x$$

and put x = 0 in this identity, we find $c_1 = 0$. Subsequently, by equating the coefficients of x,
we find

$$c_2 = \sqrt{\frac{2}{\pi}}$$

We have thus established the interesting and important result that

$$\sqrt{x}\,J_{1/2}(x) = \sqrt{\frac{2}{\pi}}\,\sin x \qquad \text{or} \qquad J_{1/2}(x) = \sqrt{\frac{2}{\pi x}}\,\sin x$$


SEC. 9.5 


IDENTITIES FOR THE BESSEL FUNCTIONS 


365 


In a similar manner it can be shown that

$$J_{-1/2}(x) = \sqrt{\frac{2}{\pi x}}\,\cos x$$




EXERCISES 

Find a complete solution of each of the following equations: 

1 y" + x m y = 0 2 xy" + 2 y' + ±xy = 0 

3 x 2 y" + 3xy' + (1 4- x)y = 0 4 xy" — y' + kx^y - 0 

6 xhj" + 2x 2 y' + (z 4 4- x* — 2)y = 0 6 x 2 y" + (2** + x)y‘ -f- (x a + 3x - l)y = 0 

7 Show that 



8 Show that any solution of

$$(x^{m-1}y')' = kx^{m-2}y \qquad \text{or} \qquad (x^{m-1}y')' = -kx^{m-2}y$$

will also satisfy the equation $(x^my'')'' = k^2x^{m-2}y$.

9 What is a complete solution of $(x^2y'')'' = 9y$?

10 What is a complete solution of $x^2y^{\mathrm{IV}} + 8xy''' + 12y'' - y = 0$?


9.5 

Identities for the Bessel functions

The Bessel functions are related by an amazing array of identities. 
Fundamental among these are the consequences of the following 
pair of theorems: 


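THEOREM 1

$$\frac{d[x^{\nu}J_\nu(x)]}{dx} = x^{\nu}J_{\nu-1}(x)$$

PROOF To prove this theorem, we take the series for $J_\nu(x)$, multiply it by $x^{\nu}$,
and differentiate it term by term:

$$
\begin{aligned}
J_\nu(x) &= \sum_{k=0}^{\infty} \frac{(-1)^k x^{\nu+2k}}{2^{\nu+2k}\, k!\,\Gamma(\nu+k+1)} \\
x^{\nu}J_\nu(x) &= \sum_{k=0}^{\infty} \frac{(-1)^k x^{2\nu+2k}}{2^{\nu+2k}\, k!\,\Gamma(\nu+k+1)} \\
\frac{d[x^{\nu}J_\nu(x)]}{dx} &= \sum_{k=0}^{\infty} \frac{(-1)^k\, 2(\nu+k)\, x^{2\nu+2k-1}}{2^{\nu+2k}\, k!\,(\nu+k)\Gamma(\nu+k)} \\
&= \sum_{k=0}^{\infty} \frac{(-1)^k x^{\nu} x^{(\nu-1)+2k}}{2^{\nu-1+2k}\, k!\,\Gamma(\nu-1+k+1)} \\
&= x^{\nu}\sum_{k=0}^{\infty} \frac{(-1)^k x^{\nu-1+2k}}{2^{\nu-1+2k}\, k!\,\Gamma(\nu-1+k+1)} \\
&= x^{\nu}J_{\nu-1}(x) \qquad \text{as asserted.}
\end{aligned}
$$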




THEOREM 2

$$\frac{d[x^{-\nu}J_\nu(x)]}{dx} = -x^{-\nu}J_{\nu+1}(x)$$

PROOF Theorem 2 can be proved in essentially the same manner as Theorem
1, but it is easier and perhaps more instructive to proceed as follows: By
performing the indicated differentiations and simplifying, we can verify at once that the
differential equation

$$\frac{d}{dx}\left[x^{1-2\nu}\frac{d(x^{\nu}y)}{dx}\right] + x^{1-\nu}y = 0$$

is precisely Bessel’s equation of order $\nu$ and is, therefore, satisfied by the particular
solution

$$y = J_\nu(x)$$

Hence, substituting, we have

$$\frac{d}{dx}\left[x^{1-2\nu}\frac{d[x^{\nu}J_\nu(x)]}{dx}\right] + x^{1-\nu}J_\nu(x) = 0$$

Now, using the result of Theorem 1, this can be written

$$\frac{d}{dx}\left[x^{1-2\nu}x^{\nu}J_{\nu-1}(x)\right] = -x^{1-\nu}J_\nu(x)$$

or

$$\frac{d}{dx}\left[x^{1-\nu}J_{\nu-1}(x)\right] = -x^{1-\nu}J_\nu(x)$$

Finally, replacing $\nu$ by $\nu + 1$, we have the assertion of Theorem 2.


By using their definitions in terms of $J_\nu(x)$ and $J_{-\nu}(x)$, one
can readily show that the Bessel functions of the second kind $Y_\nu(x)$
and the Hankel functions $H_\nu^{(1)}(x)$ and $H_\nu^{(2)}(x)$ also satisfy the
identities of Theorems 1 and 2. Furthermore, by arguments similar
to those we have just used, the following theorems can be
established:


THEOREM 3

$$\frac{d}{dx}[x^{\nu}I_\nu(x)] = x^{\nu}I_{\nu-1}(x)$$

THEOREM 4

$$\frac{d}{dx}[x^{-\nu}I_\nu(x)] = x^{-\nu}I_{\nu+1}(x)$$

THEOREM 5

$$\frac{d}{dx}[x^{\nu}K_\nu(x)] = -x^{\nu}K_{\nu-1}(x)$$

THEOREM 6

$$\frac{d}{dx}[x^{-\nu}K_\nu(x)] = -x^{-\nu}K_{\nu+1}(x)$$




Performing the differentiations in the identities of Theorems
1 and 2, we obtain, respectively,

$$
\begin{aligned}
x^{\nu}J_\nu'(x) + \nu x^{\nu-1}J_\nu(x) &= x^{\nu}J_{\nu-1}(x) \\
x^{-\nu}J_\nu'(x) - \nu x^{-\nu-1}J_\nu(x) &= -x^{-\nu}J_{\nu+1}(x)
\end{aligned}
$$

or, dividing the first of these by $x^{\nu}$ and multiplying the second
by $x^{\nu}$ and solving for $J_\nu'(x)$ in each case,

$$J_\nu'(x) = J_{\nu-1}(x) - \frac{\nu}{x}J_\nu(x) \tag{1}$$

$$J_\nu'(x) = \frac{\nu}{x}J_\nu(x) - J_{\nu+1}(x) \tag{2}$$

Adding these and dividing by 2, we obtain a third formula for
$J_\nu'(x)$:

$$J_\nu'(x) = \frac{J_{\nu-1}(x) - J_{\nu+1}(x)}{2} \tag{3}$$

Subtracting (2) from (1) gives the important recurrence
formula

$$J_{\nu-1}(x) + J_{\nu+1}(x) = \frac{2\nu}{x}J_\nu(x)$$

Written as

$$J_{\nu+1}(x) = \frac{2\nu}{x}J_\nu(x) - J_{\nu-1}(x) \tag{4}$$

this formula serves to express Bessel functions of higher orders
in terms of functions of lower orders, frequently a useful
manipulation. Written as

$$J_{\nu-1}(x) = \frac{2\nu}{x}J_\nu(x) - J_{\nu+1}(x) \tag{5}$$

it serves similarly to express Bessel functions of large negative
orders (for instance) in terms of Bessel functions whose orders
are numerically smaller.


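EXAMPLE 1

Express $J_4(ax)$ in terms of $J_0(ax)$ and $J_1(ax)$.

Taking $\nu = 3$ in (4), we first have

$$J_4(ax) = \frac{6}{ax}J_3(ax) - J_2(ax)$$

Applying (4) again to $J_3(ax)$ and then to $J_2(ax)$, we have further

$$
\begin{aligned}
J_4(ax) &= \frac{6}{ax}\left[\frac{4}{ax}J_2(ax) - J_1(ax)\right] - J_2(ax) \\
&= \left(\frac{24}{a^2x^2} - 1\right)J_2(ax) - \frac{6}{ax}J_1(ax) \\
&= \left(\frac{24}{a^2x^2} - 1\right)\left[\frac{2}{ax}J_1(ax) - J_0(ax)\right] - \frac{6}{ax}J_1(ax) \\
&= \left(\frac{48}{a^3x^3} - \frac{8}{ax}\right)J_1(ax) - \left(\frac{24}{a^2x^2} - 1\right)J_0(ax)
\end{aligned}
$$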



EXAMPLE 2

Show that

$$\frac{d[xJ_\nu(x)J_{\nu+1}(x)]}{dx} = x[J_\nu^2(x) - J_{\nu+1}^2(x)]$$

Performing the differentiation, we have

$$\frac{d[xJ_\nu(x)J_{\nu+1}(x)]}{dx} = J_\nu(x)J_{\nu+1}(x) + xJ_\nu'(x)J_{\nu+1}(x) + xJ_\nu(x)J_{\nu+1}'(x)$$

Then, substituting for $xJ_\nu'(x)$ from (2) and for $xJ_{\nu+1}'(x)$ from (1), we have

$$
\begin{aligned}
\frac{d[xJ_\nu(x)J_{\nu+1}(x)]}{dx} &= J_\nu(x)J_{\nu+1}(x) + J_{\nu+1}(x)[\nu J_\nu(x) - xJ_{\nu+1}(x)] + J_\nu(x)[xJ_\nu(x) - (\nu + 1)J_{\nu+1}(x)] \\
&= x[J_\nu^2(x) - J_{\nu+1}^2(x)]
\end{aligned}
$$

The basic differentiation identities of Theorems 1 and 2,
when written as integration formulas

$$\int x^{\nu}J_{\nu-1}(x)\,dx = x^{\nu}J_\nu(x) + c \tag{6}$$

$$\int x^{-\nu}J_{\nu+1}(x)\,dx = -x^{-\nu}J_\nu(x) + c \tag{7}$$

suffice for the integration of numerous simple expressions
involving Bessel functions. For example, taking $\nu = 1$ in (6), we have

$$\int xJ_0(x)\,dx = xJ_1(x) + c$$

Similarly, taking $\nu = 0$ in (7), we find

$$\int J_1(x)\,dx = -J_0(x) + c$$

Usually, however, integration by parts must be used in addition
to (6) and (7).
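The two special cases just derived can be checked by numerical quadrature over a definite interval. The sketch below uses a composite Simpson rule with 200 subintervals on $[0, 2]$; these choices, like the function names, are assumptions made only for illustration.

```python
import math

def bessel_j(nu, z, terms=40):
    """Partial sum of the series for J_nu(z); assumes z >= 0 and integer nu >= 0."""
    return sum((-1) ** m * (z / 2.0) ** (nu + 2 * m)
               / (math.factorial(m) * math.gamma(nu + m + 1))
               for m in range(terms))

def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

x = 2.0
lhs6 = simpson(lambda t: t * bessel_j(0, t), 0.0, x)   # integral of t J_0(t)
rhs6 = x * bessel_j(1, x)                              # x J_1(x), from (6) with nu = 1
lhs7 = simpson(lambda t: bessel_j(1, t), 0.0, x)       # integral of J_1(t)
rhs7 = 1.0 - bessel_j(0, x)                            # -J_0(x) + J_0(0), from (7) with nu = 0
print(lhs6, rhs6)
print(lhs7, rhs7)
```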


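EXAMPLE 3

What is $\int J_3(x)\,dx$?

If we multiply and divide the integrand by $x^2$, we have $\int x^2[x^{-2}J_3(x)]\,dx$, and so, integrating
by parts with

$$
\begin{aligned}
u &= x^2 & dv &= x^{-2}J_3(x)\,dx \\
du &= 2x\,dx & v &= -x^{-2}J_2(x) \qquad [\text{by (7), with } \nu = 2]
\end{aligned}
$$

we have

$$
\begin{aligned}
\int J_3(x)\,dx &= -J_2(x) + 2\int x^{-1}J_2(x)\,dx \\
&= -J_2(x) - 2x^{-1}J_1(x) + c \qquad [\text{by (7), with } \nu = 1]
\end{aligned}
$$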


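EXAMPLE 4

What is $\int [J_2(3x)/x^2]\,dx$?

Here it is convenient to multiply the numerator and denominator of the integrand by $9x^2$,
getting

$$\frac{1}{9}\int (3x)^2J_2(3x)\,\frac{dx}{x^4}$$

Now, integrating by parts with

$$
\begin{aligned}
u &= (3x)^2J_2(3x) & dv &= \frac{dx}{x^4} \\
du &= (3x)^2J_1(3x)\,3\,dx & v &= -\frac{1}{3x^3}
\end{aligned}
$$

we have

$$\int \frac{J_2(3x)}{x^2}\,dx = \frac{1}{9}\left\{-\frac{3J_2(3x)}{x} + 3\int 3xJ_1(3x)\,\frac{dx}{x^2}\right\}$$

Again using integration by parts, with

$$
\begin{aligned}
u &= 3xJ_1(3x) & dv &= \frac{dx}{x^2} \\
du &= 3xJ_0(3x)\,3\,dx & v &= -\frac{1}{x}
\end{aligned}
$$

we have further

$$
\begin{aligned}
\int \frac{J_2(3x)}{x^2}\,dx &= \frac{1}{9}\left\{-\frac{3J_2(3x)}{x} + 3\left[-3J_1(3x) + 9\int J_0(3x)\,dx\right]\right\} \\
&= -\frac{J_2(3x)}{3x} - J_1(3x) + 3\int J_0(3x)\,dx
\end{aligned}
$$

The residual integral $\int J_0(3x)\,dx$ cannot be evaluated in finite form.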


In general, an integral of the form

$$\int x^mJ_n(x)\,dx$$

where m and n are integers such that $m + n \ge 0$, can be
completely integrated if m + n is odd, but will ultimately depend
upon the residual integral $\int J_0(x)\,dx$ if m + n is even. For this
reason $\int_0^x J_0(t)\,dt$ has now been tabulated.*

Another class of identities of considerable interest can be
obtained from the expansion of the function

$$\exp\left[\frac{x}{2}\left(t - \frac{1}{t}\right)\right] = e^{xt/2}e^{-x/2t} \tag{8}$$

in terms of powers of t. To derive this expansion we first replace
the exponentials on the right of (8) by their infinite series, getting

$$\left[\sum_{i=0}^{\infty} \frac{x^it^i}{2^i\,i!}\right]\left[\sum_{j=0}^{\infty} \frac{(-1)^jx^jt^{-j}}{2^j\,j!}\right]$$

Now when these series are multiplied together, we obtain a
term containing $t^n$ ($n \ge 0$) when and only when the general term
in the second series, i.e., the term containing $t^{-j}$, is multiplied by
the term in the first series which contains $t^{n+j}$, i.e., the term for
which i = n + j. Therefore, taking into account all possible values
of j, we find that the total coefficient of $t^n$ in the product of the
two series is

$$\sum_{j=0}^{\infty} \left[\frac{x^{n+j}}{(n+j)!\,2^{n+j}}\right]\left[\frac{(-1)^jx^j}{j!\,2^j}\right] = \sum_{j=0}^{\infty} \frac{(-1)^jx^{n+2j}}{2^{n+2j}\,j!\,\Gamma(n+j+1)} = J_n(x)$$

Similarly, a term containing $t^{-n}$ arises when and only when the
general term in the first series, i.e., the term containing $t^i$, is

* A. N. Lowan and Milton Abramowitz, “Tables of Integrals of $\int_0^x J_0(t)\,dt$
and $\int_0^x Y_0(t)\,dt$,” MT 20, Superintendent of Documents, Government
Printing Office, Washington, D.C.



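multiplied by the term in the second series which contains $t^{-n-i}$,
i.e., the term for which j = n + i. Therefore, taking into account
all possible values of i, we find that the total coefficient of $t^{-n}$ in
the product of the two series is

$$
\begin{aligned}
\sum_{i=0}^{\infty} \left[\frac{x^i}{i!\,2^i}\right]\left[\frac{(-1)^{n+i}x^{n+i}}{(n+i)!\,2^{n+i}}\right] &= (-1)^n\sum_{i=0}^{\infty} \frac{(-1)^ix^{n+2i}}{2^{n+2i}\,i!\,\Gamma(n+i+1)} \\
&= (-1)^nJ_n(x)
\end{aligned}
$$

Hence,

$$\exp\left[\frac{x}{2}\left(t - \frac{1}{t}\right)\right] = J_0(x) + \sum_{n=1}^{\infty} J_n(x)[t^n + (-1)^nt^{-n}] \tag{9}$$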
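Expansion (9) can be tested numerically by truncating the sum. In the sketch below the sample point $x = 1.3$, $t = 0.7$ and the cutoff of 25 terms are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

def bessel_j(n, x, terms=40):
    """Partial sum of the series for J_n(x); assumes x > 0 and integer n >= 0."""
    return sum((-1) ** m * (x / 2.0) ** (n + 2 * m)
               / (math.factorial(m) * math.gamma(n + m + 1))
               for m in range(terms))

def generating_sum(x, t, n_max=25):
    """Right-hand side of (9), truncated after n = n_max."""
    total = bessel_j(0, x)
    for n in range(1, n_max + 1):
        total += bessel_j(n, x) * (t ** n + (-1) ** n * t ** (-n))
    return total

x, t = 1.3, 0.7
lhs = math.exp(0.5 * x * (t - 1.0 / t))
rhs = generating_sum(x, t)
print(lhs, rhs)
```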

Now let $t = e^{i\phi}$, so that

$$\frac{1}{2}\left(t - \frac{1}{t}\right) = \frac{e^{i\phi} - e^{-i\phi}}{2} = i\sin\phi$$

and

$$\exp\left[\frac{x}{2}\left(t - \frac{1}{t}\right)\right] = e^{ix\sin\phi} = \cos(x\sin\phi) + i\sin(x\sin\phi)$$

In the same way, when n is even, say n = 2k, we have

$$t^n + (-1)^nt^{-n} = t^{2k} + (-1)^{2k}t^{-2k} = e^{i2k\phi} + e^{-i2k\phi} = 2\cos 2k\phi$$

and, when n is odd, say n = 2k - 1, we have

$$t^n + (-1)^nt^{-n} = t^{2k-1} + (-1)^{2k-1}t^{-(2k-1)} = e^{i(2k-1)\phi} - e^{-i(2k-1)\phi} = 2i\sin(2k - 1)\phi$$

Therefore Eq. (9) can be written

$$
\begin{aligned}
e^{ix\sin\phi} &= \cos(x\sin\phi) + i\sin(x\sin\phi) \\
&= J_0(x) + 2\sum_{k=1}^{\infty} J_{2k}(x)\cos 2k\phi + 2i\sum_{k=1}^{\infty} J_{2k-1}(x)\sin(2k - 1)\phi
\end{aligned}
$$

Equating real and imaginary parts in the last expression, we
obtain the identities

$$\cos(x\sin\phi) = J_0(x) + 2\sum_{k=1}^{\infty} J_{2k}(x)\cos 2k\phi \tag{10}$$

$$\sin(x\sin\phi) = 2\sum_{k=1}^{\infty} J_{2k-1}(x)\sin(2k - 1)\phi \tag{11}$$

The series on the right in (10) and (11) are, of course, just the
Fourier expansions of the functions on the left.

Now multiply both sides of (10) by $\cos n\phi$ and both sides of
(11) by $\sin n\phi$, and integrate each identity with respect to $\phi$ from
0 to $\pi$. Since

$$\int_0^{\pi} \cos m\phi\,\cos n\phi\,d\phi = \int_0^{\pi} \sin m\phi\,\sin n\phi\,d\phi = 0 \qquad m \ne n$$

$$\int_0^{\pi} \cos^2 n\phi\,d\phi = \int_0^{\pi} \sin^2 n\phi\,d\phi = \frac{\pi}{2}$$




this yields

$$
\int_0^{\pi} \cos n\phi\,\cos(x\sin\phi)\,d\phi =
\begin{cases}
\pi J_n(x) & n \text{ even} \\
0 & n \text{ odd}
\end{cases}
$$

$$
\int_0^{\pi} \sin n\phi\,\sin(x\sin\phi)\,d\phi =
\begin{cases}
0 & n \text{ even} \\
\pi J_n(x) & n \text{ odd}
\end{cases}
$$

If we add these two expressions and divide by $\pi$, we have, for all
integral values of n,

$$J_n(x) = \frac{1}{\pi}\int_0^{\pi} [\cos n\phi\,\cos(x\sin\phi) + \sin n\phi\,\sin(x\sin\phi)]\,d\phi$$

since, for every value of n, one or the other of the integrals vanishes
while the remaining one contributes $J_n(x)$. Finally, using the
formula for the cosine of the difference of two quantities, we have

$$J_n(x) = \frac{1}{\pi}\int_0^{\pi} \cos(n\phi - x\sin\phi)\,d\phi \qquad n \text{ an integer} \tag{12}$$
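Formula (12) lends itself to direct numerical evaluation. The following sketch applies the trapezoidal rule with 200 steps (an arbitrary choice) and compares the result with the power series; the function names are hypothetical and $x > 0$ is assumed for the series.

```python
import math

def bessel_j_series(n, x, terms=40):
    """Partial sum of the series for J_n(x); assumes x > 0 and integer n >= 0."""
    return sum((-1) ** m * (x / 2.0) ** (n + 2 * m)
               / (math.factorial(m) * math.gamma(n + m + 1))
               for m in range(terms))

def bessel_j_integral(n, x, steps=200):
    """J_n(x) from formula (12) by the trapezoidal rule on [0, pi]."""
    h = math.pi / steps
    f = lambda phi: math.cos(n * phi - x * math.sin(phi))
    s = 0.5 * (f(0.0) + f(math.pi))
    s += sum(f(i * h) for i in range(1, steps))
    return s * h / math.pi

for n in range(4):
    print(n, bessel_j_integral(n, 2.5), bessel_j_series(n, 2.5))
```

The trapezoidal rule is a natural choice here because the integrand extends to a smooth even periodic function of $\phi$, for which that rule converges very rapidly.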


EXERCISES 

1 Express $J_5(x)$ in terms of $J_0(x)$ and $J_1(x)$.

2 Express $J_{3/2}(x)$ and $J_{-3/2}(x)$ in terms of sin x and cos x.

3 4 Wba tia ™>- 

ax ax 

5 Show that

$$\frac{d[x^2J_{\nu-1}(x)J_{\nu+1}(x)]}{dx} = 2x^2J_\nu(x)\frac{dJ_\nu(x)}{dx}$$

6 Prove Theorem 2 by using the series expansion for $J_\nu(x)$.

7 Show that

a $4J_\nu''(x) = J_{\nu-2}(x) - 2J_\nu(x) + J_{\nu+2}(x)$
b $8J_\nu'''(x) = J_{\nu-3}(x) - 3J_{\nu-1}(x) + 3J_{\nu+1}(x) - J_{\nu+3}(x)$

8 Show that J"(x) = |~ ~ 1 j /»(*) — — 

9 Show that /< 


■ w “ ; r 


cos (a; cos </>) /<#>. 


10 By expanding the integrand into an infinite series and integrating term by term, show that 


• r 


J <,(x cos <#>) cos <£ d4> = 


11 Show that JJa(x)dx — ‘2[Ji(x) + J six) + / 6 (x) + • 

12 Show that 


fir/2 1 — COS X 

/ J\(x cos <(>) dtl> = 

Jo X 

. [Hint: Use Formula (3).] 


j J 0 {x ) dx - J iix) + J - 


J x* 


- /dx) + — + ~ /»(*) + 1-3-5 f - 
XX 2 J 


= / l(x) + /dx) + ■ 


(2n - 2) !/„(x) ^ (2n)j f /»(») 

!"»! J x™ 


2»-i(n - l)!x»“ l 2' 

[Hint: Use repeated integration by parts, each time taking dv — x* +1 /*(x) dx-1 






26 


13 Show that

$$\int xJ_\nu^2(x)\,dx = \frac{x^2}{2}\left[J_\nu^2(x) - J_{\nu-1}(x)J_{\nu+1}(x)\right] + c$$

(Hint: After integrating by parts, the result of Exercise 5 may be helpful.)
14 Show that $\int J_0(x)\cos x\,dx = xJ_0(x)\cos x + xJ_1(x)\sin x + c$.

15 Show that $\int J_0(x)\sin x\,dx = xJ_0(x)\sin x - xJ_1(x)\cos x + c$.

16 Show that $\int J_1(x)\cos x\,dx = xJ_1(x)\cos x - J_0(x)(x\sin x + \cos x) + c$.

17 Show that $\int J_1(x)\sin x\,dx = xJ_1(x)\sin x + J_0(x)(x\cos x - \sin x) + c$.

18 What is

a $\int xJ_0(x)\cos x\,dx$? b $\int xJ_1(x)\sin x\,dx$?

19 What is

a $\int xJ_0(x)\sin x\,dx$? b $\int xJ_1(x)\cos x\,dx$?

20 Show that

a $\int xJ_0(x)\,dx = xJ_1(x) + c$

b $\int x^2J_0(x)\,dx = x^2J_1(x) + xJ_0(x) - \int J_0(x)\,dx + c$

c $\int x^3J_0(x)\,dx = (x^3 - 4x)J_1(x) + 2x^2J_0(x) + c$

d $\int x^4J_0(x)\,dx = (x^4 - 9x^2)J_1(x) + (3x^3 - 9x)J_0(x) + 9\int J_0(x)\,dx + c$

21 Show that

a $\int \dfrac{J_1(x)}{x}\,dx = -J_1(x) + \int J_0(x)\,dx + c$

b $\int J_1(x)\,dx = -J_0(x) + c$

c $\int xJ_1(x)\,dx = -xJ_0(x) + \int J_0(x)\,dx + c$

d $\int x^2J_1(x)\,dx = 2xJ_1(x) - x^2J_0(x) + c$

e $\int x^3J_1(x)\,dx = 3x^2J_1(x) - (x^3 - 3x)J_0(x) - 3\int J_0(x)\,dx + c$

f $\int x^4J_1(x)\,dx = (4x^3 - 16x)J_1(x) - (x^4 - 8x^2)J_0(x) + c$

What is JxJA 1 - x) dx? 23 What is fJo(V^) dx? 

Show that 


c /;o.)- L^w ± L l W 


b /» -'/,(*) +/,+!(*) 
d r,_,(x) - 7. + ,(x) =~/.(x) 


What is 
a fxT 0 (x)dx? 
c JxJi(x) dx? 


b /x a /o(x)dx? 
d /x 2 /i(x) dx? 


9.6 

Orthogonality of the Bessel functions 

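If we write Bessel’s equation of order $\nu$ in the form

$$\frac{d}{dx}\left(x\frac{dy}{dx}\right) + \left(\lambda^2x - \frac{\nu^2}{x}\right)y = 0$$

it is clear that it is a special case, with

$$p(x) = x \qquad q(x) = -\frac{\nu^2}{x} \qquad r(x) = x$$

and $\lambda^2$ written in place of $\lambda$, of the general equation covered by
Theorem 4, Sec. 8.5. If the solutions of Bessel’s equation satisfy
boundary conditions of the form

$$A_iy_\nu(\lambda x_i) - B_i\left.\frac{dy_\nu(\lambda x)}{dx}\right|_{x = x_i} = 0 \qquad i = 1, 2 \tag{1}$$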

SEC. 9.6 


ORTHOGONALITY OF THE BESSEL FUNCTIONS 


373 


they must, therefore, be orthogonal with respect to the weight 
function p(x) = x over the interval (x h x^).] 

For practical purposes, however, it is not enough to know 
that the characteristic functions of a problem are orthogonal. 
In order to carry out the expansions required at the final stage 
of a typical boundary value problem, it is also necessary to know 
the value of the integral of the product of the weight function 
and the square of the general characteristic function, taken over 
the interval of the problem. 

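We begin this calculation by considering the indefinite
integral $\int t\,y_\nu^2(t)\,dt$, where $y_\nu(t)$ is any solution of Bessel's equation;
i.e.,

$$t^2 y_\nu'' + t\,y_\nu' + (t^2 - \nu^2)\,y_\nu = 0 \tag{2}$$

If Eq. (2) is multiplied by $y_\nu'$ and then integrated, we obtain

$$\int t^2 y_\nu'' y_\nu'\,dt + \int t\,(y_\nu')^2\,dt + \int t^2 y_\nu y_\nu'\,dt - \nu^2\int y_\nu y_\nu'\,dt = 0 \tag{3}$$

Now, evaluating the first and third integrals by parts, we have

$$\int t^2 y_\nu'' y_\nu'\,dt = \frac{t^2 (y_\nu')^2}{2} - \int t\,(y_\nu')^2\,dt \qquad \left[u = t^2,\ dv = y_\nu'' y_\nu'\,dt;\quad du = 2t\,dt,\ v = \tfrac12 (y_\nu')^2\right]$$

$$\int t^2 y_\nu y_\nu'\,dt = \frac{t^2 y_\nu^2}{2} - \int t\,y_\nu^2\,dt \qquad \left[u = t^2,\ dv = y_\nu y_\nu'\,dt;\quad du = 2t\,dt,\ v = \tfrac12 y_\nu^2\right]$$

Then, substituting these results into Eq. (3), we find

$$\left(\frac{t^2 (y_\nu')^2}{2} - \int t\,(y_\nu')^2\,dt\right) + \int t\,(y_\nu')^2\,dt + \left(\frac{t^2 y_\nu^2}{2} - \int t\,y_\nu^2\,dt\right) - \frac{\nu^2 y_\nu^2}{2} = 0$$

or, collecting terms and solving for $\int t\,y_\nu^2\,dt$,

$$\int t\,y_\nu^2(t)\,dt = \frac12\left[(t^2 - \nu^2)\,y_\nu^2(t) + t^2\left(\frac{dy_\nu(t)}{dt}\right)^2\right]$$

If we now put $t = \lambda_m x$, where $\lambda_m$ is any one of the characteristic
values for which solutions satisfying the boundary conditions
exist, and then divide by $\lambda_m^2$, we obtain the integral in which we
are actually interested:

$$\int x\,y_\nu^2(\lambda_m x)\,dx = \frac{1}{2\lambda_m^2}\left[(\lambda_m^2 x^2 - \nu^2)\,y_\nu^2(\lambda_m x) + x^2\left(\frac{dy_\nu(\lambda_m x)}{dx}\right)^2\right] \tag{4}$$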
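The antiderivative just derived can be spot-checked numerically. The sketch below is only an illustration, not part of the text's argument: it takes $y_\nu = J_1$, evaluates $J_n$ from its power series, and compares a Simpson-rule value of $\int_1^4 t\,J_1^2(t)\,dt$ with the difference of the bracketed expression at the two limits. The helper names (`bessel_j`, `simpson`, `antiderivative`) and the choice of limits are assumptions made for the example; the derivative is obtained from the standard recurrence $t\,y_\nu' = \nu y_\nu - t\,y_{\nu+1}$.

```python
import math

def bessel_j(n, x, terms=60):
    """J_n(x), integer n >= 0, from its power series (adequate for |x| up to about 20)."""
    half = x / 2.0
    return sum((-1) ** k * half ** (2 * k + n)
               / (math.factorial(k) * math.factorial(k + n))
               for k in range(terms))

def simpson(f, a, b, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def antiderivative(nu, t):
    """(1/2)[(t^2 - nu^2) y^2 + t^2 (y')^2] with y = J_nu, for t > 0."""
    y = bessel_j(nu, t)
    dy = nu * y / t - bessel_j(nu + 1, t)  # from t*y' = nu*y - t*y_{nu+1}
    return 0.5 * ((t * t - nu * nu) * y * y + t * t * dy * dy)

nu, a, b = 1, 1.0, 4.0  # illustrative choices
lhs = simpson(lambda t: t * bessel_j(nu, t) ** 2, a, b)
rhs = antiderivative(nu, b) - antiderivative(nu, a)
print(lhs, rhs)  # the two values should agree to many decimal places
```

The power series is used only to keep the sketch free of external libraries; it loses accuracy for large arguments, so a library routine would be the more sensible choice in practice.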

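The evaluation of (4) between the specific limits $x_1$ and $x_2$
requires the consideration of several special cases, according as
$B_i$ in the boundary conditions (1) is or is not equal to zero. If
$B_i = 0$, then (1) becomes simply

$$y_\nu(\lambda_m x_i) = 0 \tag{5}$$

and the antiderivative on the right of (4), evaluated at $x = x_i$, reduces to

$$\frac{x_i^2}{2\lambda_m^2}\left[\frac{dy_\nu(\lambda_m x)}{dx}\right]^2_{x=x_i} \tag{6}$$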

† Since $r(x) \equiv x$ vanishes when $x = 0$, it follows from the proof of Theorem 4,
Sec. 8.5, that if $x_1 = 0$, no boundary condition will be needed (and none
will be available) at $x = x_1$.


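This can be further simplified by recalling from the preceding
section that all solutions of Bessel's equation, $J_\nu$, $Y_\nu$, $H_\nu^{(1)}$,
and $H_\nu^{(2)}$, as well as arbitrary linear combinations of these functions,
satisfy the identity

$$t\,\frac{dy_\nu(t)}{dt} = \nu\,y_\nu(t) - t\,y_{\nu+1}(t)$$

or

$$x\,\frac{dy_\nu(\lambda_m x)}{dx} = \nu\,y_\nu(\lambda_m x) - \lambda_m x\,y_{\nu+1}(\lambda_m x)$$

Evaluating this at $x = x_i$ and using (5), we find

$$\left.\frac{dy_\nu(\lambda_m x)}{dx}\right|_{x=x_i} = -\lambda_m\,y_{\nu+1}(\lambda_m x_i)$$

and so (6) becomes simply

$$\frac{x_i^2}{2}\,y_{\nu+1}^2(\lambda_m x_i)$$

On the other hand, if $B_i \neq 0$, then we can use (1) to substitute for the
derivative on the right of Eq. (4), getting

$$\frac{y_\nu^2(\lambda_m x_i)}{2\lambda_m^2}\left[(\lambda_m x_i)^2 - \nu^2 + \left(\frac{A_i x_i}{B_i}\right)^2\right]$$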

The results of the preceding discussion are summarized in 
the following important theorem: 


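THEOREM 1

The solutions of Bessel's equation of order $\nu$ which satisfy the boundary conditions

$$A_i\,y_\nu(\lambda x_i) - B_i\left.\frac{dy_\nu(\lambda x)}{dx}\right|_{x=x_i} = 0 \qquad i = 1, 2$$

form an orthogonal system with respect to the weight function $x$ over the interval
$(x_1, x_2)$. The integral of the product of the weight function and the square of any
solution of the system $\{y_\nu(\lambda_m x)\}$, i.e.,

$$\int_{x_1}^{x_2} x\,y_\nu^2(\lambda_m x)\,dx$$

is equal to

$$\frac{y_\nu^2(\lambda_m x_2)}{2\lambda_m^2}\left[(\lambda_m x_2)^2 - \nu^2 + \left(\frac{A_2 x_2}{B_2}\right)^2\right] - \frac{y_\nu^2(\lambda_m x_1)}{2\lambda_m^2}\left[(\lambda_m x_1)^2 - \nu^2 + \left(\frac{A_1 x_1}{B_1}\right)^2\right] \qquad B_1 B_2 \neq 0$$

$$\frac{y_\nu^2(\lambda_m x_2)}{2\lambda_m^2}\left[(\lambda_m x_2)^2 - \nu^2 + \left(\frac{A_2 x_2}{B_2}\right)^2\right] - \frac{x_1^2}{2}\,y_{\nu+1}^2(\lambda_m x_1) \qquad B_1 = 0,\ B_2 \neq 0$$

$$\frac{x_2^2}{2}\,y_{\nu+1}^2(\lambda_m x_2) - \frac{y_\nu^2(\lambda_m x_1)}{2\lambda_m^2}\left[(\lambda_m x_1)^2 - \nu^2 + \left(\frac{A_1 x_1}{B_1}\right)^2\right] \qquad B_1 \neq 0,\ B_2 = 0$$

$$\frac{x_2^2}{2}\,y_{\nu+1}^2(\lambda_m x_2) - \frac{x_1^2}{2}\,y_{\nu+1}^2(\lambda_m x_1) \qquad B_1 = B_2 = 0$$

If $x_1 = 0$, no boundary condition is needed at $x = x_1$, and the contribution to
the integral from the lower limit is zero.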
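As an illustration of the theorem in the simplest case ($x_1 = 0$, $B_2 = 0$), the following sketch checks two things numerically for $J_1$ on $(0, 2)$ with $J_1(2\lambda) = 0$: that the weighted square integral equals $(x_2^2/2)\,J_2^2(2\lambda_1)$, and that two different characteristic functions are orthogonal with respect to the weight $x$. The helpers, the bracketing intervals for the roots, and the quadrature settings are assumptions chosen for the example.

```python
import math

def bessel_j(n, x, terms=60):
    """J_n(x), integer n >= 0, from its power series (adequate for |x| up to about 20)."""
    half = x / 2.0
    return sum((-1) ** k * half ** (2 * k + n)
               / (math.factorial(k) * math.factorial(k + n))
               for k in range(terms))

def simpson(f, a, b, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi], assuming f changes sign there."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (flo < 0) == (fmid < 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# First two positive roots of J_1(z) = 0; the brackets are illustrative guesses.
z1 = bisect(lambda z: bessel_j(1, z), 3.0, 4.5)
z2 = bisect(lambda z: bessel_j(1, z), 6.5, 7.5)
lam1, lam2 = z1 / 2.0, z2 / 2.0  # characteristic values for J_1(2*lam) = 0

norm = simpson(lambda x: x * bessel_j(1, lam1 * x) ** 2, 0.0, 2.0)
predicted = (2.0 ** 2 / 2.0) * bessel_j(2, 2.0 * lam1) ** 2  # (x2^2/2) * y_{nu+1}^2
cross = simpson(lambda x: x * bessel_j(1, lam1 * x) * bessel_j(1, lam2 * x), 0.0, 2.0)
print(norm, predicted, cross)
```

The first two printed numbers should agree closely and the third should be essentially zero.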


EXAMPLE 1

Expand $f(x) = 4x - x^3$ over the interval (0,2) in terms of the Bessel functions of the first kind
of order 1 which satisfy the boundary condition

$$J_1(\lambda x)\big|_{x=2} = 0$$

In this case the characteristic values are the values of $\lambda$ determined by the roots of the
equation

$$J_1(2\lambda) = 0$$

Now the roots of the equation $J_1(z) = 0$ are*

$$z_0 = 0 \qquad z_1 = 3.832 \qquad z_2 = 7.016 \qquad z_3 = 10.174 \qquad z_4 = 13.324 \qquad \ldots$$

Hence,

$$\lambda_0 = 0 \qquad \lambda_1 = 1.916 \qquad \lambda_2 = 3.508 \qquad \lambda_3 = 5.087 \qquad \lambda_4 = 6.662 \qquad \ldots$$

Therefore, since $J_1(\lambda_0 x) = J_1(0) = 0$, the characteristic functions in terms of which the
expansion is to be carried out are

$$J_1(\lambda_1 x) \qquad J_1(\lambda_2 x) \qquad J_1(\lambda_3 x) \qquad \ldots$$

As in the simpler case of Fourier expansions, we begin by writing

$$f(x) = 4x - x^3 = A_1 J_1(\lambda_1 x) + A_2 J_1(\lambda_2 x) + \cdots + A_m J_1(\lambda_m x) + \cdots$$

Multiplying both sides of this expression by $xJ_1(\lambda_m x)$, integrating from 0 to 2, and using the
results of Theorem 1, we have

$$\int_0^2 (4x - x^3)\,xJ_1(\lambda_m x)\,dx = A_m\int_0^2 xJ_1^2(\lambda_m x)\,dx = 2A_m J_2^2(2\lambda_m)$$

Hence

$$A_m = \frac{\int_0^2 (4x^2 - x^4)\,J_1(\lambda_m x)\,dx}{2J_2^2(2\lambda_m)}$$

For the integral

$$4\int_0^2 x^2 J_1(\lambda_m x)\,dx = \frac{4}{\lambda_m^3}\int_0^2 (\lambda_m x)^2 J_1(\lambda_m x)\,d(\lambda_m x)$$

we have immediately, from Eq. (6), Sec. 9.5,

$$\frac{4}{\lambda_m^3}\Big[t^2 J_2(t)\Big]_0^{2\lambda_m} = \frac{16\,J_2(2\lambda_m)}{\lambda_m}$$

To evaluate

$$\int_0^2 x^4 J_1(\lambda_m x)\,dx = \frac{1}{\lambda_m^5}\int_0^2 (\lambda_m x)^4 J_1(\lambda_m x)\,d(\lambda_m x) = \frac{1}{\lambda_m^5}\int_0^{2\lambda_m} t^4 J_1(t)\,dt$$

we use integration by parts, with

$$u = t^2 \qquad dv = t^2 J_1(t)\,dt$$
$$du = 2t\,dt \qquad v = t^2 J_2(t)$$

This gives

$$\int_0^2 x^4 J_1(\lambda_m x)\,dx = \frac{1}{\lambda_m^5}\left[t^4 J_2(t)\Big|_0^{2\lambda_m} - 2\int_0^{2\lambda_m} t^3 J_2(t)\,dt\right] = \frac{16}{\lambda_m^2}\left[\lambda_m J_2(2\lambda_m) - J_3(2\lambda_m)\right]$$

Therefore

$$A_m = \frac{8\,J_3(2\lambda_m)}{\lambda_m^2\,J_2^2(2\lambda_m)}$$

But, by Formula (4), Sec. 9.5,

$$J_3(2\lambda_m) = \frac{4}{2\lambda_m}J_2(2\lambda_m) - J_1(2\lambda_m) = \frac{2}{\lambda_m}J_2(2\lambda_m)$$

since the $\lambda$'s were determined by the condition that $J_1(2\lambda_m) = 0$. Therefore $A_m$ can be further
simplified to

$$A_m = \frac{16}{\lambda_m^3\,J_2(2\lambda_m)}$$

The same reduction can be repeated for $J_2(2\lambda_m)$, since

$$J_2(2\lambda_m) = \frac{2}{2\lambda_m}J_1(2\lambda_m) - J_0(2\lambda_m) = -J_0(2\lambda_m)$$

Hence, finally,

$$A_m = -\frac{16}{\lambda_m^3\,J_0(2\lambda_m)}$$

The required expansion is, therefore,

$$4x - x^3 = -16\sum_{m=1}^{\infty}\frac{J_1(\lambda_m x)}{\lambda_m^3\,J_0(2\lambda_m)}$$

Plots showing the degree to which the first term and the first two terms of this series
approximate $4x - x^3$ are shown in Fig. 9.7.

* See, for instance, Eugene Jahnke, Fritz Emde, and Friedrich Lösch,
"Tables of Higher Functions," 6th ed., p. 193, McGraw-Hill Book Company,
New York, 1960.

FIGURE 9.7 Plot showing the approximation of a function by the first two terms of a Bessel function expansion. [Curves plotted for $0 \le x \le 2$: $y = -\dfrac{16\,J_1(1.916x)}{(1.916)^3 J_0(3.832)}$ and $y = -\dfrac{16\,J_1(1.916x)}{(1.916)^3 J_0(3.832)} - \dfrac{16\,J_1(3.508x)}{(3.508)^3 J_0(7.016)}$.]
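The expansion of $4x - x^3$ obtained in this example can be explored numerically. The sketch below is a rough illustration under stated assumptions: it locates the first six positive roots of $J_1$ by a sign scan followed by bisection, forms the coefficients $A_m = -16/[\lambda_m^3 J_0(2\lambda_m)]$, and evaluates a six-term partial sum. The function names, the scan step of 0.5, and the number of terms are choices made for the example, and the power-series evaluation of $J_n$ is only trustworthy for arguments up to about 20.

```python
import math

def bessel_j(n, x, terms=60):
    """J_n(x), integer n >= 0, from its power series (adequate for |x| up to about 20)."""
    half = x / 2.0
    return sum((-1) ** k * half ** (2 * k + n)
               / (math.factorial(k) * math.factorial(k + n))
               for k in range(terms))

def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi], assuming f changes sign there."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (flo < 0) == (fmid < 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def j1_zeros(count, start=1.0, step=0.5):
    """First `count` positive roots of J_1, found by scanning for sign changes."""
    zeros, z = [], start
    while len(zeros) < count:
        if bessel_j(1, z) * bessel_j(1, z + step) < 0:
            zeros.append(bisect(lambda t: bessel_j(1, t), z, z + step))
        z += step
    return zeros

def partial_sum(x, zeros):
    """Sum of A_m * J_1(lam_m * x) with A_m = -16 / (lam_m^3 * J_0(2 * lam_m))."""
    total = 0.0
    for z in zeros:
        lam = z / 2.0  # roots z_m of J_1 give lam_m = z_m / 2
        total += -16.0 * bessel_j(1, lam * x) / (lam ** 3 * bessel_j(0, z))
    return total

zeros = j1_zeros(6)
first_coefficient = -16.0 / ((zeros[0] / 2.0) ** 3 * bessel_j(0, zeros[0]))
for x in (0.5, 1.0, 1.5):
    print(x, 4 * x - x ** 3, partial_sum(x, zeros))
```

Printing the exact value beside the partial sum makes it easy to see how quickly the series settles toward $4x - x^3$ at interior points.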




EXERCISES 

1 Expand $f(x) = 1$ over the interval (0,3) in terms of the functions $J_0(\lambda_m x)$, where the $\lambda$'s are
determined by $J_0(3\lambda) = 0$.

2 Expand f(x) = 1 over the interval (0,3) in terms of the functions Ji('Knx), where the X’s 
are determined by 7 2 (3X) = 0. 




3 Expand $f(x) = x$ over the interval (0,2) in terms of the functions $J_1(\lambda_m x)$, where the $\lambda$'s
are determined by $J_1(2\lambda) = 0$.

4 Expand $f(x) = x^2$ over the interval (0,3) in terms of the functions $J_0(\lambda_m x)$, where the $\lambda$'s
are determined by $\left.\dfrac{dJ_0(\lambda x)}{dx}\right|_{x=3} = 0$.

5 Expand $f(x) = x^2$ over the interval (0,1) in terms of the functions $J_2(\lambda_m x)$, where the $\lambda$'s
are determined by $J_2(\lambda) = 0$.

6 Expand 

fix) = 

in terms of the functions J’i(Xmar), where the X’s are determined by 
dJ i (Xx) J ^ 0 

7 Expand f{x) = 1 over the interval (0,3) in terms of the functions J o(X ro x), where the X’s 
are determined by 

I .0 

dx I* “3 

8 Using tables of the Bessel functions, compute the first two characteristic values in Exercise 7
correct to two decimal places.

9 If the boundary conditions in Theorem 1 are of the form $y_\nu(\lambda x) = 0$ at $x = 1$ and at $x = 5$,
what is the equation satisfied by the characteristic values $\{\lambda_m\}$?

10 Does Theorem 1 have a counterpart for the modified Bessel equation? Why?


x 0 < x < 1 

0 1 < x < 2 


9.7

Applications of Bessel functions

Bessel functions occur in a great many practical problems. In 
principle they are always to be expected when partial differential 
equations are applied to configurations possessing circular sym- 
metry. On the other hand, they also arise in numerous applica- 
tions where neither circular symmetry nor partial differential 
equations are involved. In this section we shall conclude our 
treatment of Bessel functions by discussing a variety of problems 
where their use is required. 

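EXAMPLE 1

What is $\mathcal{L}\{t^\nu J_\nu(\lambda t)\}$ if $\nu \ge 0$?

It is possible to determine the required transform by expressing $t^\nu J_\nu(\lambda t)$ as an infinite series
and then taking the transform term by term. However, it is more instructive to proceed as
follows:

From Corollary 1 of Theorem 1, Sec. 9.4, it is clear that $y = t^\nu J_\nu(\lambda t)$ is a solution of the
differential equation

$$(t^{1-2\nu}y')' + \lambda^2 t^{1-2\nu}y = 0$$

that is,

$$ty'' + (1 - 2\nu)y' + \lambda^2 ty = 0$$

If we take the Laplace transform of this equation, recalling Theorem 7, Sec. 7.4, we obtain

$$-\frac{d}{ds}\left(s^2\mathcal{L}\{y\} - sy_0 - y_0'\right) + (1 - 2\nu)\left(s\mathcal{L}\{y\} - y_0\right) - \lambda^2\frac{d}{ds}\mathcal{L}\{y\}$$

$$= -s^2\frac{d\mathcal{L}\{y\}}{ds} - 2s\mathcal{L}\{y\} + y_0 + (1 - 2\nu)\left(s\mathcal{L}\{y\} - y_0\right) - \lambda^2\frac{d\mathcal{L}\{y\}}{ds}$$

$$= -(s^2 + \lambda^2)\frac{d\mathcal{L}\{y\}}{ds} - (1 + 2\nu)s\mathcal{L}\{y\} + 2\nu y_0 = 0$$

Now, if $\nu \ge 0$, the term $2\nu y_0$ vanishes identically, because either $\nu = 0$ or else

$$y_0 = t^\nu J_\nu(\lambda t)\big|_{t=0} = 0$$

Hence, the last equation reduces to the separable differential equation

$$\frac{d\mathcal{L}\{y\}}{\mathcal{L}\{y\}} = -\frac{(1 + 2\nu)\,s\,ds}{s^2 + \lambda^2}$$

Integrating this, we have

$$\ln \mathcal{L}\{y\} = -\frac{1 + 2\nu}{2}\ln(s^2 + \lambda^2) + \ln c$$

and, therefore,

$$\mathcal{L}\{y\} \equiv \mathcal{L}\{t^\nu J_\nu(\lambda t)\} = \frac{c}{(s^2 + \lambda^2)^{(2\nu+1)/2}}$$

To determine $c$ we consider the leading term on each side of the last equality:

$$\mathcal{L}\left\{t^\nu\left(\frac{\lambda^\nu t^\nu}{2^\nu\Gamma(\nu+1)} - \cdots\right)\right\} = \frac{c}{s^{2\nu+1}}\left(1 + \frac{\lambda^2}{s^2}\right)^{-(2\nu+1)/2}$$

$$\frac{\lambda^\nu}{2^\nu\Gamma(\nu+1)}\,\mathcal{L}\{t^{2\nu}\} - \cdots = \frac{c}{s^{2\nu+1}} - \cdots$$

$$\frac{\lambda^\nu}{2^\nu\Gamma(\nu+1)}\,\frac{\Gamma(2\nu+1)}{s^{2\nu+1}} - \cdots = \frac{c}{s^{2\nu+1}} - \cdots$$

Hence, since this must be an identity, we find

$$c = \frac{\lambda^\nu\,\Gamma(2\nu+1)}{2^\nu\,\Gamma(\nu+1)}$$

and so

$$\mathcal{L}\{t^\nu J_\nu(\lambda t)\} = \frac{\lambda^\nu\,\Gamma(2\nu+1)}{2^\nu\,\Gamma(\nu+1)\,(s^2 + \lambda^2)^{(2\nu+1)/2}} \qquad \nu \ge 0 \tag{1}$$

Numerous other transform formulas can be obtained from (1). For instance, since, from (1),

$$\mathcal{L}\{J_0(\lambda t)\} = \frac{1}{\sqrt{s^2 + \lambda^2}}$$

and

$$\frac{dJ_0(\lambda t)}{dt} = -\lambda J_1(\lambda t)$$

it follows that

$$\mathcal{L}\{J_1(\lambda t)\} = -\frac{1}{\lambda}\,\mathcal{L}\left\{\frac{dJ_0(\lambda t)}{dt}\right\} = -\frac{1}{\lambda}\left[s\mathcal{L}\{J_0(\lambda t)\} - J_0(0)\right]$$

$$= -\frac{1}{\lambda}\left(\frac{s}{\sqrt{s^2 + \lambda^2}} - 1\right) = \frac{1}{\lambda}\left(\frac{\sqrt{s^2 + \lambda^2} - s}{\sqrt{s^2 + \lambda^2}}\right) = \frac{\lambda}{\sqrt{s^2 + \lambda^2}\left(s + \sqrt{s^2 + \lambda^2}\right)}$$

Other results will be found among the exercises.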
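Formula (1) and the transform of $J_1(\lambda t)$ derived from it can be sanity-checked by direct numerical integration. The sketch below is illustrative only: it truncates the Laplace integral at a finite upper limit (an assumption that is harmless here because the factor $e^{-st}$ is negligible beyond it), uses integer orders so that a power-series $J_n$ suffices, and picks the arbitrary values $s = 2$, $\lambda = 1$. All function names are invented for the example.

```python
import math

def bessel_j(n, x, terms=60):
    """J_n(x), integer n >= 0, from its power series (adequate for |x| up to about 20)."""
    half = x / 2.0
    return sum((-1) ** k * half ** (2 * k + n)
               / (math.factorial(k) * math.factorial(k + n))
               for k in range(terms))

def simpson(f, a, b, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def transform_formula(nu, lam, s):
    """Right-hand side of Eq. (1)."""
    return (lam ** nu * math.gamma(2 * nu + 1)
            / (2 ** nu * math.gamma(nu + 1) * (s * s + lam * lam) ** ((2 * nu + 1) / 2)))

def transform_numeric(nu, lam, s, upper=12.0, n=1200):
    """Truncated Laplace integral of t^nu * J_nu(lam * t)."""
    return simpson(lambda t: math.exp(-s * t) * t ** nu * bessel_j(nu, lam * t),
                   0.0, upper, n)

s, lam = 2.0, 1.0  # illustrative values
checks = [(nu, transform_numeric(nu, lam, s), transform_formula(nu, lam, s))
          for nu in (0, 1)]

# Transform of J_1(lam*t) as obtained in the text from the transform of J_0.
root = math.sqrt(s * s + lam * lam)
j1_closed = lam / (root * (s + root))
j1_numeric = simpson(lambda t: math.exp(-s * t) * bessel_j(1, lam * t), 0.0, 12.0, 1200)
print(checks, j1_numeric, j1_closed)
```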




EXAMPLE 2 

A uniform, perfectly flexible cable of length $l$ and weight per unit length $w$ hangs by one end
from a frictionless hook. At $t = 0$, while the cable is at rest in a vertical position, a uniform
horizontal velocity $v$ is imparted to the portion of the cable between $x = 0$ and $x = \alpha l$ (Fig. 9.8).
Find the expression describing the subsequent motion of the cable.

FIGURE 9.8 A hanging cable acted upon by a transverse force over a portion of its length.

This is essentially the problem of the vibrating string discussed in Sec. 8.2 except for one
important difference. Here, instead of being constant, the tension at a general point of the cable
is equal to the weight $wx$ of the portion of the cable below that point. Hence, in this case Eq. (1),
Sec. 8.2, becomes in the limit

$$\frac{w}{g}\frac{\partial^2 y}{\partial t^2} = \frac{\partial}{\partial x}\left(wx\frac{\partial y}{\partial x}\right)$$

As usual, we assume a product solution $y = X(x)T(t)$ and attempt to separate variables.
Then, substituting, we have

$$T''X = gT(xX')' \qquad \text{or} \qquad \frac{(xX')'}{X} = \frac{T''}{gT}$$

The common value of these two fractions must be a negative constant, say $-\lambda^2$, for otherwise $T$
will not be a periodic function, as we know it must be. Hence,

$$T = A\cos\lambda\sqrt{g}\,t + B\sin\lambda\sqrt{g}\,t$$

and

$$(xX')' + \lambda^2X = 0 \tag{2}$$

Using Corollary 1 of Theorem 1, Sec. 9.4, the solution for $X$ is found at once to be

$$X = CJ_0(2\lambda\sqrt{x}) + DY_0(2\lambda\sqrt{x})$$

Since the displacement of the free end of the cable will obviously be finite,
whereas $Y_0(2\lambda\sqrt{x})$ becomes infinite as $x$ approaches zero, it is clear that $D$
must be zero. Moreover, for all values of $t$, $y$ is zero when $x = l$. Hence
$X(l) = 0$; that is,

$$J_0(2\lambda\sqrt{l}) = 0 \tag{3}$$

This, of course, is the frequency equation of the system. It has infinitely
many roots,

$$2\lambda\sqrt{l} = 2.4048,\ 5.5201,\ 8.6537,\ \ldots$$

and so the natural frequencies of the cable, namely, $\omega_n = \lambda_n\sqrt{g}$, are

$$\omega_1 = 1.2024\sqrt{g/l} \qquad \omega_2 = 2.7600\sqrt{g/l} \qquad \omega_3 = 4.3268\sqrt{g/l} \qquad \ldots$$

We have now been led to an infinite sequence of product solutions,

$$y_m(x,t) = X_m(x)T_m(t) = J_0(2\lambda_m\sqrt{x})\left[A_m\cos(\lambda_m\sqrt{g}\,t) + B_m\sin(\lambda_m\sqrt{g}\,t)\right]$$

None of these by itself can satisfy the given initial conditions, namely,

$$y(x,0) = 0 \qquad \left.\frac{\partial y}{\partial t}\right|_{t=0} = f(x) = \begin{cases} v & 0 < x < \alpha l \\ 0 & \alpha l < x < l \end{cases}$$

Hence, as usual, we form an infinite series of the individual product solutions,

$$y(x,t) = \sum_{m=1}^{\infty} J_0(2\lambda_m\sqrt{x})\left[A_m\cos(\lambda_m\sqrt{g}\,t) + B_m\sin(\lambda_m\sqrt{g}\,t)\right] \tag{4}$$

and attempt to make it fit the initial conditions.

Now Eq. (2) with its accompanying boundary condition $X(l) = 0$ meets all the conditions
of Theorem 4, Sec. 8.5. Hence, the $X$'s are orthogonal with respect to the weight function
$p(x) \equiv 1$ over the interval $(0,l)$, and thus the $A$'s and $B$'s can be determined by the familiar
generalized Fourier procedure. To find $A_m$ we put $t = 0$ and $y = 0$ in (4), getting

$$0 = \sum_{m=1}^{\infty} A_mJ_0(2\lambda_m\sqrt{x})$$

from which it is obvious that

$$A_m = 0 \qquad m = 1, 2, 3, \ldots$$

To find $B_m$ we differentiate (4) with respect to $t$ and then put $t = 0$ and $\partial y/\partial t = f(x)$, getting

$$f(x) = \sum_{m=1}^{\infty} \sqrt{g}\,\lambda_mB_mJ_0(2\lambda_m\sqrt{x}) \tag{5}$$

Next, we multiply (5) by $J_0(2\lambda_m\sqrt{x})$ and integrate from 0 to $l$. From the orthogonality of the
$J_0$'s, every term on the right but one becomes zero, and we have

$$\int_0^l f(x)J_0(2\lambda_m\sqrt{x})\,dx \equiv \int_0^{\alpha l} vJ_0(2\lambda_m\sqrt{x})\,dx = \sqrt{g}\,\lambda_mB_m\int_0^l J_0^2(2\lambda_m\sqrt{x})\,dx$$

or

$$B_m = \frac{v\int_0^{\alpha l} J_0(2\lambda_m\sqrt{x})\,dx}{\sqrt{g}\,\lambda_m\int_0^l J_0^2(2\lambda_m\sqrt{x})\,dx}$$

To evaluate these integrals we make the obvious substitutions $x = u^2$ and $dx = 2u\,du$,
getting

$$B_m = \frac{v\int_0^{\sqrt{\alpha l}} uJ_0(2\lambda_mu)\,du}{\sqrt{g}\,\lambda_m\int_0^{\sqrt{l}} uJ_0^2(2\lambda_mu)\,du}$$

The integral in the numerator is precisely

$$\left.\frac{uJ_1(2\lambda_mu)}{2\lambda_m}\right|_0^{\sqrt{\alpha l}} = \frac{\sqrt{\alpha l}\,J_1(2\lambda_m\sqrt{\alpha l})}{2\lambda_m}$$

Because of the condition (3), the value of the integral in the denominator is, as we showed in the
proof of Theorem 1, Sec. 9.6,

$$\frac{l}{2}\,J_1^2(2\lambda_m\sqrt{l})$$

Hence, finally,

$$B_m = \frac{v\sqrt{\alpha}\,J_1(2\lambda_m\sqrt{\alpha l})}{\lambda_m^2\sqrt{gl}\,J_1^2(2\lambda_m\sqrt{l})}$$

With $A_m$ and $B_m$ determined for all values of $m$, the solution is now complete.

It is interesting to note that, since the $\lambda$'s are incommensurable, there are no two times
when the terms $\sin\sqrt{g}\,\lambda_mt$ are respectively the same. Hence the cable never returns to a
position coinciding exactly with an earlier one unless it is vibrating in one of its normal modes, that
is, unless all but one of the $B_m$'s are zero, which cannot happen for the given $f(x)$. This is in
sharp contrast to the behavior of the string stretched under uniform tension, which repeats any
configuration exactly after intervals of $2l/a$, where $a$ is the propagation velocity for the string.
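The numerical factors in the natural frequencies of the cable come directly from the roots of $J_0$. As a small illustration, the sketch below finds the first three positive roots by a sign scan and bisection, halves them to get the coefficients of $\sqrt{g/l}$, and then evaluates the frequencies for a hypothetical cable. The length $l = 2$ and $g = 9.81$ (SI units) are assumed values chosen only for the example, as are the function names.

```python
import math

def bessel_j(n, x, terms=60):
    """J_n(x), integer n >= 0, from its power series (adequate for |x| up to about 20)."""
    half = x / 2.0
    return sum((-1) ** k * half ** (2 * k + n)
               / (math.factorial(k) * math.factorial(k + n))
               for k in range(terms))

def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi], assuming f changes sign there."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (flo < 0) == (fmid < 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def j0_zeros(count, start=1.0, step=0.5):
    """First `count` positive roots of J_0, found by scanning for sign changes."""
    zeros, z = [], start
    while len(zeros) < count:
        if bessel_j(0, z) * bessel_j(0, z + step) < 0:
            zeros.append(bisect(lambda t: bessel_j(0, t), z, z + step))
        z += step
    return zeros

roots = j0_zeros(3)                  # values of 2 * lam * sqrt(l)
coefficients = [z / 2.0 for z in roots]  # omega_n = coefficient * sqrt(g / l)

g, length = 9.81, 2.0                # hypothetical cable: SI units assumed
frequencies = [c * math.sqrt(g / length) for c in coefficients]
print(coefficients, frequencies)
```

Because the ratios of these coefficients are not rational, the printed frequencies illustrate the incommensurability remarked on at the end of the example.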




EXAMPLE 3 


A metal fin of triangular cross section is attached to a plane surface to help carry off heat from
the latter. Assuming dimensions and coordinates as shown in Fig. 9.9, find the steady-state
temperature distribution along the fin if the wall temperature is $u_w$ and if the fin cools freely
into air of constant temperature $u_0$.

FIGURE 9.9 A portion of a triangular cooling fin attached to a flat wall.

We shall base our analysis upon a unit length of the fin and shall assume that the fin is so
thin that temperature variations parallel to the base can be neglected. Now, consider the heat
balance in the element of the fin between $x$ and $x + \Delta x$. This element gains heat by internal flow
through its right face and loses heat by internal flow through its left face and also by cooling
through its upper and lower surfaces. Through the right face the gain of heat per unit time is

$$\text{Area} \times \text{thermal conductivity} \times \text{temperature gradient} = \left[\left(1\cdot\frac{bx}{a}\right)k\,\frac{du}{dx}\right]_{x+\Delta x} = \left[\frac{bkx}{a}\frac{du}{dx}\right]_{x+\Delta x}$$

Through the left face the element loses heat at the rate

$$\left[\frac{bkx}{a}\frac{du}{dx}\right]_x$$

Through the surfaces exposed to the air the element loses heat at the rate

$$\text{Area} \times \text{surface conductivity} \times (\text{surface temperature} - \text{air temperature}) = 2\left(\frac{\Delta x}{\cos\theta}\cdot 1\right)h(u - u_0) = \frac{2h(u - u_0)\,\Delta x}{\cos\theta}$$

Under steady-state conditions the rate of gain of heat must equal the rate of loss, and thus
we have

$$\left[\frac{bkx}{a}\frac{du}{dx}\right]_{x+\Delta x} = \left[\frac{bkx}{a}\frac{du}{dx}\right]_x + \frac{2h(u - u_0)\,\Delta x}{\cos\theta}$$

Writing this as

$$\frac{[x(du/dx)]_{x+\Delta x} - [x(du/dx)]_x}{\Delta x} - \frac{2ah}{bk\cos\theta}(u - u_0) = 0$$

and letting $\Delta x \to 0$, we obtain the differential equation

$$\frac{d(xu')}{dx} - \frac{2ah}{bk\cos\theta}(u - u_0) = 0$$

If we set

$$U = u - u_0 \qquad \text{and} \qquad \alpha^2 = \frac{2ah}{bk\cos\theta}$$

this becomes

$$\frac{d(xU')}{dx} - \alpha^2U = 0$$

This can be solved immediately by means of the corollary of Theorem 1, Sec. 9.4, and we have

$$U = u - u_0 = c_1I_0(2\alpha\sqrt{x}) + c_2K_0(2\alpha\sqrt{x})$$

Since $K_0(2\alpha\sqrt{x})$ is infinite when $x = 0$, $c_2$ must be zero, leaving

$$u - u_0 = c_1I_0(2\alpha\sqrt{x})$$

Furthermore, $u = u_w$ when $x = a$; hence,

$$u_w - u_0 = c_1I_0(2\alpha\sqrt{a}) \qquad \text{or} \qquad c_1 = \frac{u_w - u_0}{I_0(2\alpha\sqrt{a})}$$

Therefore,

$$u = u_0 + (u_w - u_0)\,\frac{I_0(2\alpha\sqrt{x})}{I_0(2\alpha\sqrt{a})}$$
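To get a feel for the fin solution, the temperature profile $u = u_0 + (u_w - u_0)\,I_0(2\alpha\sqrt{x})/I_0(2\alpha\sqrt{a})$ can be tabulated for sample data. The sketch below does this with $I_0$ evaluated from its power series. The parameter values ($a = 1$, $\alpha = 1.5$, $u_w = 100$, $u_0 = 20$) are purely hypothetical and are not taken from the text; the function names are likewise invented for the example.

```python
import math

def bessel_i0(x, terms=60):
    """Modified Bessel function I_0(x) from its power series."""
    half = x / 2.0
    return sum(half ** (2 * k) / math.factorial(k) ** 2 for k in range(terms))

def fin_temperature(x, a, alpha, u_wall, u_air):
    """Steady-state temperature at distance x from the tip of the triangular fin."""
    ratio = bessel_i0(2 * alpha * math.sqrt(x)) / bessel_i0(2 * alpha * math.sqrt(a))
    return u_air + (u_wall - u_air) * ratio

# Hypothetical data: fin length a, parameter alpha, wall and air temperatures.
a, alpha, u_wall, u_air = 1.0, 1.5, 100.0, 20.0
profile = [(x, fin_temperature(x, a, alpha, u_wall, u_air))
           for x in (0.0, 0.25, 0.5, 0.75, 1.0)]
for x, u in profile:
    print(x, u)
```

The tabulated values rise monotonically from the tip ($x = 0$) to the wall ($x = a$), where the profile reproduces $u_w$ exactly, as the boundary condition requires.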


EXAMPLE 4 

A solid consists of one-half of a right circular cylinder of radius $b$ and height $h$ (Fig. 9.10). The
lower base, the curved surface, and the vertical plane face are maintained at the constant
temperature $u = 0$. Over the upper base the temperature is a known function of position $f(r,\theta)$.
Assuming steady-state conditions, find the temperature at any point in the solid.

FIGURE 9.10 A half cylinder in which heat flow occurs because of surface temperature conditions.

Because of the nature of the boundaries of the solid it will be highly inconvenient to use the
heat equation in the cartesian form in which we derived it in Sec. 8.2. Instead, we use it as
expressed in cylindrical coordinates by means of the change of variables

$$x = r\cos\theta \qquad y = r\sin\theta \qquad z = z$$

namely,

$$\frac{\partial^2u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{r^2}\frac{\partial^2u}{\partial\theta^2} + \frac{\partial^2u}{\partial z^2} = a^2\frac{\partial u}{\partial t}$$

or, more specifically, for steady-state conditions, under which $\partial u/\partial t = 0$,

$$\frac{\partial^2u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{r^2}\frac{\partial^2u}{\partial\theta^2} + \frac{\partial^2u}{\partial z^2} = 0 \tag{6}$$

Our first step is to assume a product solution $u(r,\theta,z) = R(r)\Theta(\theta)Z(z)$ and substitute it
into (6) in an attempt to separate the variables. This gives

$$R''\Theta Z + \frac{1}{r}R'\Theta Z + \frac{1}{r^2}R\Theta''Z + R\Theta Z'' = 0$$

or, multiplying by $r^2$ and dividing by $R\Theta Z$,

$$\frac{r^2R''}{R} + \frac{rR'}{R} + \frac{r^2Z''}{Z} = -\frac{\Theta''}{\Theta} = \mu_1$$

where the common value $\mu_1$ is necessarily a constant, since the variables appearing on the
respective sides of the equation are independent of each other.

If $\mu_1 < 0$, say $\mu_1 = -\nu^2$, then $\Theta''/\Theta = \nu^2$ and

$$\Theta = A\cosh\nu\theta + B\sinh\nu\theta \tag{7}$$

Now, by hypothesis,

$$u(r,0,z) = R(r)\Theta(0)Z(z) = 0 \qquad \text{and} \qquad u(r,\pi,z) = R(r)\Theta(\pi)Z(z) = 0$$

and these can hold for all values of $r$ and $z$ only if $\Theta(0) = \Theta(\pi) = 0$. From (7) we see that the
condition $\Theta(0) = 0$ will be satisfied only if $A = 0$. To satisfy the condition $\Theta(\pi) = 0$, it is
necessary that

$$B\sinh\nu\pi = 0$$

which, since $\nu \neq 0$, is possible only if $B = 0$. Thus, the possibility $\mu_1 < 0$ leads only to a trivial
solution and, hence, must be rejected.

If $\mu_1 = 0$, then $\Theta'' = 0$ and

$$\Theta = A + B\theta$$

Again imposing the conditions $\Theta(0) = \Theta(\pi) = 0$, we find, as before, that $A = B = 0$. Hence,
the possibility $\mu_1 = 0$ must also be rejected, since it leads only to a trivial solution.

Finally, if $\mu_1 > 0$, say $\mu_1 = \nu^2$, we have

$$\frac{\Theta''}{\Theta} = -\nu^2 \qquad \text{and} \qquad \Theta = A\cos\nu\theta + B\sin\nu\theta$$

For this to vanish when $\theta = 0$, we must have $A = 0$. For it to vanish when $\theta = \pi$, it is necessary
that

$$B\sin\nu\pi = 0$$

Since we cannot permit $B$ to be zero, because that would lead again to a trivial solution, we
must have

$$\sin\nu\pi = 0$$

Hence

$$\nu = 1, 2, 3, \ldots$$

and so for $\Theta$ we have the family of solutions

$$\Theta_n(\theta) = \sin n\theta$$

With $\mu_1$ now known to be $n^2$, the differential equation for $R$ and $Z$ becomes

$$\frac{r^2R''}{R} + \frac{rR'}{R} + \frac{r^2Z''}{Z} = n^2$$

or, rearranging,

$$\frac{Z''}{Z} = \frac{n^2}{r^2} - \frac{R''}{R} - \frac{1}{r}\frac{R'}{R} = \mu_2$$

where, again, since $r$ and $z$ are independent variables, it follows that the common value $\mu_2$ must
be a constant.

If $\mu_2 < 0$, say $\mu_2 = -\lambda^2$, we have

$$\frac{R''}{R} + \frac{1}{r}\frac{R'}{R} - \lambda^2 - \frac{n^2}{r^2} = 0 \qquad \text{or} \qquad r^2R'' + rR' - (\lambda^2r^2 + n^2)R = 0$$

which is precisely the modified Bessel equation. Hence,

$$R = CI_n(\lambda r) + DK_n(\lambda r)$$

Now $K_n(\lambda r)$ is infinite when $r = 0$; hence, to keep the temperature finite on the axis of the
cylinder, it is necessary that $D = 0$. Also, by hypothesis,

$$u(b,\theta,z) = R(b)\Theta(\theta)Z(z) = 0$$

Hence,

$$R(b) \equiv CI_n(\lambda b) = 0$$

But the modified Bessel function $I_n$ is never zero except possibly at the origin. Therefore, the
last condition can hold only if $C = 0$. But with $C$ and $D$ both zero, the solution is trivial, and
so the possibility that $\mu_2 < 0$ must be rejected.

If $\mu_2 = 0$, then

$$\frac{R''}{R} + \frac{1}{r}\frac{R'}{R} - \frac{n^2}{r^2} = 0 \qquad \text{or} \qquad r^2R'' + rR' - n^2R = 0$$

This is not a Bessel-type equation, but is instead an example of the Euler equation (Example 3,
Sec. 2.6). By the usual change of independent variable

$$r = e^v \qquad \text{or} \qquad v = \ln r$$

it becomes

$$\frac{d^2R}{dv^2} - n^2R = 0$$

so that

$$R = Ce^{nv} + De^{-nv} = Cr^n + Dr^{-n}$$

To keep the temperature finite on the axis, where $r = 0$, it is necessary that $D = 0$. To keep the
temperature zero when $r = b$, it is necessary that

$$0 = Cb^n$$

which will be the case only if $C = 0$. This means that again the solution is trivial, and $\mu_2 = 0$
must also be rejected.

Finally, if $\mu_2 > 0$, say $\mu_2 = \lambda^2$, we have

$$\frac{R''}{R} + \frac{1}{r}\frac{R'}{R} + \lambda^2 - \frac{n^2}{r^2} = 0 \qquad \text{or} \qquad r^2R'' + rR' + (\lambda^2r^2 - n^2)R = 0$$

and

$$R = CJ_n(\lambda r) + DY_n(\lambda r)$$

Since $Y_n(\lambda r)$ is infinite when $r = 0$, we must have $D = 0$. To keep the temperature zero on the
curved surface of the cylinder we must have

$$R(b) \equiv CJ_n(\lambda b) = 0$$

Since $C = 0$ leads to a trivial solution, it is thus necessary that

$$J_n(\lambda b) = 0$$

that is, $\lambda$ is restricted to the set of values

$$\lambda_{nm} = \frac{\rho_{nm}}{b}$$

where $\rho_{nm}$ is the $m$th one of the roots of the equation $J_n(x) = 0$. Thus, for every value of $n$,
there are infinitely many particular solutions for $R$, namely,

$$R_{nm}(r) = J_n(\lambda_{nm}r)$$

Now that we know that $\mu_2 = \lambda_{nm}^2$, it is an easy matter to solve for $Z$, and we have

$$\frac{Z''}{Z} = \lambda_{nm}^2 \qquad \text{and} \qquad Z = E\cosh\lambda_{nm}z + F\sinh\lambda_{nm}z$$

Since $u(r,\theta,0) = R(r)\Theta(\theta)Z(0) = 0$, it follows that $Z(0) = 0$, from which we conclude that
$E = 0$. The solution for $Z$ associated with $R_{nm}$ is, therefore,

$$Z_{nm}(z) = \sinh\lambda_{nm}z$$

For each $n$ we therefore have infinitely many product solutions consisting of the same factor
$\Theta(\theta) = \sin n\theta$ multiplied by the product of any pair of corresponding $R$'s and $Z$'s:

$$u_{nm} = A_{nm}J_n(\lambda_{nm}r)\sinh\lambda_{nm}z\,\sin n\theta$$

In other words, we have a double array of product solutions,

$$\begin{array}{llllll}
u_{11}, & u_{12}, & u_{13}, & \ldots, & u_{1m}, & \ldots \\
u_{21}, & u_{22}, & u_{23}, & \ldots, & u_{2m}, & \ldots \\
\multicolumn{6}{l}{\dotfill} \\
u_{n1}, & u_{n2}, & u_{n3}, & \ldots, & u_{nm}, & \ldots \\
\multicolumn{6}{l}{\dotfill}
\end{array}$$

Since none of the product solutions by itself is capable of representing the given temperature
distribution $f(r,\theta)$ on the upper base, it is necessary that we construct an infinite series of the
$u_{nm}$'s and try to make it fit the temperature condition when $z = h$. To build up a series for $u$
we first add up all the product solutions associated with a particular value of $n$, getting

$$u_n = \sum_{m=1}^{\infty} u_{nm} = \sin n\theta\sum_{m=1}^{\infty} A_{nm}J_n(\lambda_{nm}r)\sinh\lambda_{nm}z$$

This, of course, amounts to forming the sums of the elements in each of the rows in the above
array. Next, we add up all these series for every value of $n$:

$$u(r,\theta,z) = \sum_{n=1}^{\infty} u_n = \sum_{n=1}^{\infty}\left[\sin n\theta\sum_{m=1}^{\infty} A_{nm}J_n(\lambda_{nm}r)\sinh\lambda_{nm}z\right] \tag{8}$$

The final step now is to determine the $A$'s so that this double series will reduce to $f(r,\theta)$ when
$z = h$:

$$f(r,\theta) = \sum_{n=1}^{\infty}\left[\sin n\theta\sum_{m=1}^{\infty} A_{nm}J_n(\lambda_{nm}r)\sinh\lambda_{nm}h\right] \tag{9}$$

To carry out this expansion, let us imagine that $r$ is held constant and that $\theta$ is allowed to
vary over the range of the problem $(0,\pi)$. Under these conditions the inner sum in (9) is
effectively a constant depending on $n$, say $G_n$, or more explicitly $G_n(r)$. That is,

$$f(r,\theta) = \sum_{n=1}^{\infty} G_n\sin n\theta$$

But the determination of the $G$'s is a familiar problem! In fact it is nothing but the Fourier
sine-expansion problem, and we can write immediately

$$G_n \equiv G_n(r) = \frac{2}{\pi}\int_0^{\pi} f(r,\theta)\sin n\theta\,d\theta \tag{10}$$

Thus $G_n(r)$ is a known function of $r$. But, by definition, $G_n(r)$ was the inner sum in (9); that is,

$$G_n(r) = \sum_{m=1}^{\infty}(A_{nm}\sinh\lambda_{nm}h)\,J_n(\lambda_{nm}r)$$

Hence, it is clear that the $A$'s must be such that the products $A_{nm}\sinh\lambda_{nm}h$ are the coefficients
in a Bessel function expansion of the now known function $G_n(r)$. Hence, from the theory of the
last section, recalling that the $\lambda$'s were determined by the condition

$$J_n(\lambda b) = 0$$

we can write

$$A_{nm}\sinh\lambda_{nm}h = \frac{\int_0^b rG_n(r)J_n(\lambda_{nm}r)\,dr}{(b^2/2)\,J_{n+1}^2(\lambda_{nm}b)}$$

Therefore,

$$A_{nm} = \frac{\int_0^b rG_n(r)J_n(\lambda_{nm}r)\,dr}{(b^2/2)\sinh\lambda_{nm}h\;J_{n+1}^2(\lambda_{nm}b)}$$
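As a brief illustration of how the double series collapses in a simple case, suppose (purely as an assumed example, not one treated in the text) that the temperature on the upper base has the separable form $f(r,\theta) = g(r)\sin\theta$, where $g$ is some given function. Then Eq. (10) picks out a single value of $n$, and only the $n = 1$ row of the array of product solutions survives:

```latex
% Assumed special case: f(r,\theta) = g(r)\sin\theta, with g(r) a hypothetical given function.
\begin{align*}
G_n(r) &= \frac{2}{\pi}\int_0^{\pi} g(r)\sin\theta\,\sin n\theta\,d\theta
        = \begin{cases} g(r), & n = 1 \\ 0, & n = 2, 3, \ldots \end{cases} \\[4pt]
A_{1m} &= \frac{\displaystyle\int_0^b r\,g(r)\,J_1(\lambda_{1m}r)\,dr}
               {(b^2/2)\,\sinh\lambda_{1m}h\;J_2^2(\lambda_{1m}b)},
\qquad A_{nm} = 0 \quad (n \neq 1) \\[4pt]
u(r,\theta,z) &= \sin\theta\sum_{m=1}^{\infty} A_{1m}\,J_1(\lambda_{1m}r)\,\sinh\lambda_{1m}z
\end{align*}
```

Here $\lambda_{1m} = \rho_{1m}/b$, with $\rho_{1m}$ the $m$th positive root of $J_1(x) = 0$, exactly as in the general solution.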

ptS;T by(,w 

EXERCISES 

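1 What is $\mathcal{L}\{tJ_0(\lambda t)\}$?

2 What is $\mathcal{L}\{J_2(\lambda t)\}$? [Hint: Recall from Eq. (3), Sec. 9.5, that
$J_2(\lambda t) = J_0(\lambda t) - 2\,\dfrac{dJ_1(\lambda t)}{d(\lambda t)}$.]

3 What is $\mathcal{L}\{J_n(\lambda t)\}$?

4 Show that $\int_0^\infty J_0(\lambda t)\,dt = \dfrac{1}{\lambda}$. [Hint: Consider the integral defining the Laplace transform
of $J_0(\lambda t)$.]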

5 What is (a) £ j m dt? (b) JJ tM u) dt ? (c) f* ^(xi) dt? 

6 What is £-i / L L 

(V« 2 + 4s +l3j ‘ 

7 What is £-i I L ) ? 

( (s + 'd) Vs 2 + 6 2 j 

9 What is (a) £{tl 0 (xt) ) ? (b) £ (f/^xt) J ? 

11 What is £-i 

12 


8 Show that £ {/ 0 (Xf) } . 
10 What is £(/,(Xt)|? 


Vs 2 - X 2 


vV«(s -l)f 

Show that ^ J 0 (x)J 0 ( t ~ X) dX = sin <. (Hint: Recall the convolution theorem.) 


13 Show that I 0 (t) = — f l _ e ' X x . 

T ■'° VxoT^’x) X ' Hmt: Combme Formula 4, Sec. 7.3, for the 

result of Exercise 8.) T1>eorem 6 ’ Sec - 7A > and then apply the convolution theorem to the 

14 Find the solution of the equation - /.<*> for which * - - 0. Hence show that 

u»L e B m , , XW ‘ W ^ * te hy LaplMa -ethods 

using Eq. (1) and also the convolution theorem.] 

15 What is J q sin (« _ X)/,(X) dX? 

16 Derive Formula (1) bv exnresKmo- » r aa 

transform term by term. ' “ " mfi “ le Kriea “d ‘“king the Laplace 

17 Showthal£|J, (2v 4)| -L- 1 '..Wh a tis^. k -„. — , |? 

tfon in * Cm ‘ m ‘ ^ 

(at + b)y" + ( c t -f d)y' + (et +f)y = 0 
if * - be . 2a*l and o/ - k, . <*, whe „ k * a Mgative integer 



19 Show that the function

$$\phi(z) = \int_0^\infty e^{-z\cosh\theta}\,d\theta$$

satisfies the differential equation

$$z\phi'' + \phi' - z\phi = 0$$

Hence show that $\phi(z)$ is of the form $CK_0(z)$.

20 In Example 3, verify that all the heat that enters the fin is lost from its surface. What
fraction of the heat entering the fin is lost from the section between $x = 0$ and $x = a/2$?

21 Work Example 3 given that the fin is of rectangular cross section.

22 Show that the radial temperature distribution in a thin fin of rectangular cross section and
outer radius $R$ which completely encircles a heated cylinder of radius $r$ satisfies the
differential equation



where $x$ is measured radially outward from the center of the cylinder and the other
parameters have the same significance as in Example 3.

23 Solve the differential equation of Exercise 22, and find the temperature distribution in the
fin if the cylinder temperature is $u_w$.

24 Work Exercise 22, given that the fin is of triangular cross section.

25 Find the first two natural frequencies of a steel shaft 20 in. long vibrating torsionally if the
shaft is built-in at one end and free at the other and if the radius of the shaft at a distance x 
from the free end is r (x) = (x/20)K Steel weighs 0.285 lb /in. 3 , and its modulus of elasticity 
in shear is $E_s = 12 \times 10^6$ lb/in.$^2$

26 An elastic string whose weight per unit length is $w_0(1 + ax)$, where $x$ is the distance from
one end of the string, is stretched under tension $T$ between two points a distance $l$ apart.
Find the equation defining the natural frequencies of the string.

27 A body whose mass varies according to the law $m(t) = m_0(1 + at)^{-1}$ moves along the x-axis
under the influence of a force of attraction which varies directly as the distance from the
origin. Determine the equation of motion of the body if it starts from rest at the point


28 Work Exercise 27 if the force is directed away from the origin.

29 The lower end of a long thin rod of uniform cross section is clamped so that the rod is
vertical. Determine the values of the parameters of the rod for which buckling will occur
if the upper end of the rod is displaced slightly from its neutral position. [Hint: Choosing
axes as in Example 1, Sec. 2.6, the problem can be solved by using the relation $(EIy'')' = V$,
where $V$ is the transverse component of the weight of the portion of the rod above a general
point $x$, or by using the relation $(EIy'')'' = -w$, where $w$ is the transverse component
of the weight per unit length of the rod at the point $x$.]

30 A cantilever beam of length $l$ and breadth $b$ has its upper surface horizontal. The depth
of the beam varies directly as the cube root of the distance from the free end. An oblique
tensile force $F$, whose direction makes an angle $\theta$ with the horizontal, acts at the free end
of the beam. Find the equation of the deflection curve of the beam.

31 Work Exercise 30 if the force is an oblique compressive force.

32 A cantilever beam of length $l$ and breadth $b$ has its upper surface horizontal. The depth
of the beam varies directly as the two-thirds power of the distance from the free end. If
the beam bears a uniform load of $w$ lb per unit length and is acted upon by a pure tensile
force $F$ at its free end, find the equation of the deflection curve of the beam.

33 A bar has the shape of a truncated right circular cone of length $l$, the radii of its bases being
$r$ and $R$. Find the frequency equation for the torsional vibrations of the bar, assuming both
ends of the bar free.

34 Determine the limiting form of the frequency equation in Exercise 33 when $r \to R$. Check
by comparing your result with the frequency equation derived directly for a uniform bar.
(Hint: Express the Bessel functions in terms of sines and cosines.)

SB Determine the natural frequencies of a uniform circular drumhead. 

36 Find the frequency equation for the transverse vibrations of a cantilever whose -width is 
constant but whose depth varies directly as the distance from the free end. (Hint : To solve 
the differential equation defining the normal modes of the beam, recall Exercise 8, Sec. 9.4.) 

37 Find the frequency equation for the transverse vibrations of a cantilever beam which is a 
solid of revolution whose radius varies directly as the distance from the free end. 

38 Work Example 4 if the curved surface of the solid is perfectly insulated. 

39 The lower base and curved surface of a right circular cylinder of radius b and height h are 
maintained at the constant temperature u ~ 0. Over the upper base the temperature is a 
known function of position f(r,6). Assuming steady-state conditions, find the temperature 
at any point in the cylinder. 

40 A thin circular plate has its upper and lower faces insulated against the flow of heat. One 
half of its circumference is maintained at the constant temperature 100 3 ; the other half is 
maintained at the constant temperature 0°. Find the steady-state temperature distribution 
in the plate. 

41 The region between two concentric circles of radii n and r 2 is initially at a uniform tempera- 
ture of zero. At i =* 0 the temperature around the entire inner boundary is suddenly raised 
to 100°. Find the temperature at any point in the region at any subsequent time if the 
outer boundary is maintained at the temperature zero. 

42 A right circular cylinder of radius b and height h has its upper and lower bases maintained 
at the temperature Q 6 . The curved surface of the cylinder is maintained at the temperature 
distribution u{b,z) — f(z). Determine the steady-state temperature distribution throughout 
the cylinder. 

43 A right circular cylinder of radius b and height h has its lower base maintained at the con- ' 
stant temperature 0°. Over its upper base the temperature distribution u(r,h) — f(r) is 
maintained. If the curved surface cools freely into air of constant temperature 0°, find the 
steady-state temperature distribution within the cylinder. 

44 A two-dimensional region having the shape of a quarter of a circle is initially at a uniform 
temperature of 100°. At f = 0 the temperature around the entire boundary is suddenly 
reduced to zero and maintained thereafter at that value. Find the temperature at any 
point of the region at any subsequent time. 

45 Find the steady-state temperature distribution in a two-dimensional region having the 
shape of a quarter of a circle if the curved boundary and one of the radial boundaries are
maintained at the constant temperature 0° and the other radial boundary is maintained 
at the constant temperature 100°. 


9.8 

Legendre polynomials 

In Example 4, Sec. 9.7, in solving the steady-state heat equation,
i.e., Laplace’s equation, in cylindrical coordinates, we found that
one of the ordinary differential equations arising from the separation
of variables was Bessel’s equation. In very much the same
way, it turns out that, when we apply the method of separation of
variables to Laplace’s equation in spherical coordinates, one of
the ordinary differential equations which results is Legendre’s
equation.




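If the expression

$$\nabla^2 F = \frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2} + \frac{\partial^2 F}{\partial z^2}$$

is transformed from cartesian coordinates to spherical coordinates
by means of the relations (Fig. 9.11)

$$x = r \sin\theta \cos\phi \qquad y = r \sin\theta \sin\phi \qquad z = r \cos\theta$$

we obtain, after a lengthy but straightforward reduction,

$$\nabla^2 F = \frac{1}{r^2 \sin\theta} \left( r^2 \sin\theta \frac{\partial^2 F}{\partial r^2} + 2r \sin\theta \frac{\partial F}{\partial r} + \sin\theta \frac{\partial^2 F}{\partial \theta^2} + \cos\theta \frac{\partial F}{\partial \theta} + \frac{1}{\sin\theta} \frac{\partial^2 F}{\partial \phi^2} \right)$$

Hence, when Laplace’s equation $\nabla^2 F = 0$ is expressed in spherical
coordinates, it becomes

$$r^2 \sin\theta \frac{\partial^2 F}{\partial r^2} + 2r \sin\theta \frac{\partial F}{\partial r} + \sin\theta \frac{\partial^2 F}{\partial \theta^2} + \cos\theta \frac{\partial F}{\partial \theta} + \frac{1}{\sin\theta} \frac{\partial^2 F}{\partial \phi^2} = 0 \tag{1}$$

Any solution $F(r,\theta,\phi)$ of this equation is known as a spherical
harmonic.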

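In an attempt to solve Eq. (1), we assume a product solution

$$F(r,\theta,\phi) = R(r)\,G(\theta,\phi)$$

Then, substituting this into (1), we have

$$r^2 \sin\theta\, R''G + 2r \sin\theta\, R'G + \sin\theta\, R \frac{\partial^2 G}{\partial \theta^2} + \cos\theta\, R \frac{\partial G}{\partial \theta} + \frac{R}{\sin\theta} \frac{\partial^2 G}{\partial \phi^2} = 0$$

or, dividing through by $RG \sin\theta$ and rearranging,

$$\frac{r^2 R'' + 2rR'}{R} = -\left( \frac{1}{G} \frac{\partial^2 G}{\partial \theta^2} + \frac{\cos\theta}{G \sin\theta} \frac{\partial G}{\partial \theta} + \frac{1}{G \sin^2\theta} \frac{\partial^2 G}{\partial \phi^2} \right)$$

This relation can hold only if the common value of these two
expressions is a constant. For later convenience we write the
constant as $n(n + 1)$; hence, we are led to the two equations

$$r^2 R'' + 2rR' - n(n + 1)R = 0 \tag{2}$$

$$\frac{\partial^2 G}{\partial \theta^2} + \frac{\cos\theta}{\sin\theta} \frac{\partial G}{\partial \theta} + \frac{1}{\sin^2\theta} \frac{\partial^2 G}{\partial \phi^2} + n(n + 1)G = 0 \tag{3}$$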

The first of these equations is an instance of Euler’s equation
(Example 3, Sec. 2.6), and it is easy to verify that its complete
solution is

$$R = c_1 r^n + \frac{c_2}{r^{n+1}}$$

Solutions $G(\theta,\phi)$ of the second equation, which we will have to
find by a further separation of variables, are known as surface
harmonics.

If, in (3), we substitute $G(\theta,\phi) = \Theta(\theta)\Phi(\phi)$, we find

$$\Theta''\Phi + \frac{\cos\theta}{\sin\theta}\Theta'\Phi + \frac{1}{\sin^2\theta}\Theta\Phi'' + n(n + 1)\Theta\Phi = 0$$

or, dividing by $\Theta\Phi/\sin^2\theta$ and rearranging slightly,

$$\sin^2\theta \frac{\Theta''}{\Theta} + \sin\theta \cos\theta \frac{\Theta'}{\Theta} + n(n + 1)\sin^2\theta = -\frac{\Phi''}{\Phi}$$

Again, the common value of the two members of this equation
must be a constant, say $m^2$; thus, we have the pair of equations

$$\Phi'' + m^2\Phi = 0 \tag{4}$$

$$\sin^2\theta\, \Theta'' + \sin\theta \cos\theta\, \Theta' + [n(n + 1)\sin^2\theta - m^2]\Theta = 0 \tag{5}$$

The first of these equations is completely familiar, and its complete
solution

$$\Phi = c_3 \cos m\phi + c_4 \sin m\phi$$

can be written down at once. The second equation is known as the
associated Legendre equation,* although it is usually studied in
the form obtained by setting $x = \cos\theta$.

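If $x = \cos\theta$, then

$$\frac{d\Theta}{d\theta} = \frac{d\Theta}{dx}\frac{dx}{d\theta} = -\sin\theta \frac{d\Theta}{dx}$$

$$\frac{d^2\Theta}{d\theta^2} = \frac{d}{d\theta}\left( -\sin\theta \frac{d\Theta}{dx} \right) = -\cos\theta \frac{d\Theta}{dx} - \sin\theta \frac{d^2\Theta}{dx^2}\frac{dx}{d\theta} = -\cos\theta \frac{d\Theta}{dx} + \sin^2\theta \frac{d^2\Theta}{dx^2}$$

Hence, substituting these expressions into (5), we obtain the
equation

$$\sin^2\theta \left( -\cos\theta \frac{d\Theta}{dx} + \sin^2\theta \frac{d^2\Theta}{dx^2} \right) + \sin\theta \cos\theta \left( -\sin\theta \frac{d\Theta}{dx} \right) + [n(n + 1)\sin^2\theta - m^2]\Theta = 0$$

or, dividing out $\sin^2\theta$, substituting $x = \cos\theta$ in the coefficients,
and simplifying,

$$(1 - x^2)\frac{d^2\Theta}{dx^2} - 2x\frac{d\Theta}{dx} + \left[ n(n + 1) - \frac{m^2}{1 - x^2} \right]\Theta = 0 \tag{6}$$

* After the French mathematician Adrien-Marie Legendre (1752-1833).

This is the algebraic form of the associated Legendre equation. If
$m = 0$, that is, if the solution of the original problem is independent
of the longitude angle $\phi$, then Eq. (6) reduces to

$$(1 - x^2)\frac{d^2\Theta}{dx^2} - 2x\frac{d\Theta}{dx} + n(n + 1)\Theta = 0 \tag{7}$$

which is known simply as Legendre’s equation.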

To solve Eq. (7) we use the method of Frobenius and assume
a series solution of the form

$$\Theta(x) = x^c (a_0 + a_1 x + a_2 x^2 + \cdots + a_k x^k + \cdots) \qquad a_0 \neq 0$$

Then substituting into Eq. (7), we have

$$\begin{aligned}
& a_0 c(c - 1)x^{c-2} + a_1 (c + 1)c\,x^{c-1} + a_2 (c + 2)(c + 1)x^{c} + \cdots + a_{k+2}(c + k + 2)(c + k + 1)x^{c+k} + \cdots \\
& \qquad - a_0 c(c - 1)x^{c} - \cdots - a_k (c + k)(c + k - 1)x^{c+k} - \cdots \\
& \qquad - 2a_0 c\,x^{c} - \cdots - 2a_k (c + k)x^{c+k} - \cdots \\
& \qquad + n(n + 1)a_0 x^{c} + \cdots + n(n + 1)a_k x^{c+k} + \cdots = 0
\end{aligned}$$

For this to be an identity, it is necessary that

$$a_0 c(c - 1) = 0 \qquad a_1 (c + 1)c = 0$$

and, in general,

$$a_{k+2}(c + k + 2)(c + k + 1) - a_k [(c + k)(c + k + 1) - n(n + 1)] = 0$$

If we take $c = 0$, both $a_0$ and $a_1$ remain arbitrary, and we have for
the general recurrence relation

$$a_{k+2} = -\frac{(n - k)(n + k + 1)}{(k + 1)(k + 2)}\, a_k \qquad k = 0, 1, 2, \ldots \tag{8}$$

Specifically, from (8),

$$a_0 = a_0 \qquad a_1 = a_1$$

$$a_2 = -\frac{n(n + 1)}{2!}\, a_0 \qquad a_3 = -\frac{(n - 1)(n + 2)}{3!}\, a_1$$

$$a_4 = \frac{n(n - 2)(n + 1)(n + 3)}{4!}\, a_0 \qquad a_5 = \frac{(n - 1)(n - 3)(n + 2)(n + 4)}{5!}\, a_1$$

Hence a complete solution of (7) can be written

$$\Theta(x) = a_0 \left[ 1 - \frac{n(n + 1)}{2!} x^2 + \frac{n(n - 2)(n + 1)(n + 3)}{4!} x^4 - \cdots \right] + a_1 \left[ x - \frac{(n - 1)(n + 2)}{3!} x^3 + \frac{(n - 1)(n - 3)(n + 2)(n + 4)}{5!} x^5 - \cdots \right] \tag{9}$$

These infinite series define what are known as Legendre functions
of the second kind. Since $x = \pm 1$ are the only singular points of
the differential equation (7), it follows from Theorem 1, Sec. 9.1,
that the radius of convergence of these series is 1. It can be shown,
however, that neither series converges at either of the end points
$x = \pm 1$; that is, the interval of convergence for each series is
$-1 < x < 1$.
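
The recurrence relation (8) is easy to experiment with numerically. The following is a minimal sketch, not part of the original text, which generates the coefficients $a_k$ with exact rational arithmetic; the function name `legendre_series_coeffs` and its parameters are illustrative assumptions.

```python
from fractions import Fraction


def legendre_series_coeffs(n, a0, a1, terms):
    """Return [a_0, a_1, ..., a_(terms-1)] generated by recurrence (8) with c = 0.

    The name and signature are hypothetical; only the recurrence comes from the text.
    """
    a = [Fraction(a0), Fraction(a1)]
    for k in range(terms - 2):
        # a_(k+2) = -(n - k)(n + k + 1) / ((k + 1)(k + 2)) * a_k
        a.append(-Fraction((n - k) * (n + k + 1), (k + 1) * (k + 2)) * a[k])
    return a[:terms]


# n = 4 with a_0 = 1, a_1 = 0: the even series 1 - 10x^2 + (35/3)x^4 stops after x^4.
even_n4 = legendre_series_coeffs(4, 1, 0, 10)

# n = 3 with a_0 = 0, a_1 = 1: the odd series x - (5/3)x^3 stops after x^3.
odd_n3 = legendre_series_coeffs(3, 0, 1, 10)
```

For integral n the factor (n - k) vanishes when k = n, so every later coefficient of the same parity is zero; this is the termination that produces the polynomial solutions discussed next.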



In many applications the parameter n is a positive integer.
If it is odd, then, clearly, the second series in (9) contains only a
finite number of terms; if it is even, then the first series contains
only a finite number of terms. In either of these cases the series
which reduces to a finite sum is known as a Legendre polynomial
or zonal harmonic of order n. To obtain a standard form for the
Legendre polynomials it is customary to multiply the finite sums
occurring in (9) when n is an integer by the appropriate one of the
following factors:

$$(-1)^{n/2}\, \frac{1 \cdot 3 \cdot 5 \cdots (n - 1)}{2 \cdot 4 \cdot 6 \cdots n} = \frac{(-1)^{n/2}\, n!}{2^n [(n/2)!]^2} \qquad n \text{ even}$$

$$(-1)^{(n-1)/2}\, \frac{1 \cdot 3 \cdot 5 \cdots n}{2 \cdot 4 \cdot 6 \cdots (n - 1)} = \frac{(-1)^{(n-1)/2}\, (n + 1)!}{2^n \left( \frac{n-1}{2} \right)! \left( \frac{n+1}{2} \right)!} \qquad n \text{ odd}$$

This leads to the general formula

$$P_n(x) = \sum_{k=0}^{N} \frac{(-1)^k (2n - 2k)!}{2^n\, k!\, (n - k)!\, (n - 2k)!}\, x^{n-2k} \qquad N = \begin{cases} \dfrac{n}{2} & n \text{ even} \\[1ex] \dfrac{n - 1}{2} & n \text{ odd} \end{cases} \tag{10}$$

Specifically, we have

$$P_0(x) = 1 \qquad\qquad P_1(x) = x$$

$$P_2(x) = \tfrac{1}{2}(3x^2 - 1) \qquad\qquad P_3(x) = \tfrac{1}{2}(5x^3 - 3x)$$

$$P_4(x) = \tfrac{1}{8}(35x^4 - 30x^2 + 3) \qquad\qquad P_5(x) = \tfrac{1}{8}(63x^5 - 70x^3 + 15x)$$

As these particular results illustrate, $P_n(1) = 1$ and $P_n(-1) = (-1)^n$
for all values of n. Since the infinite series in (9) diverge
when $x = \pm 1$, it is clear that to within an arbitrary constant
multiplier, $P_n(x)$ is the only solution of Legendre’s equation which
is finite on the closed interval $-1 \le x \le 1$.
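
Formula (10) translates directly into a short routine. The sketch below is an illustration rather than anything taken from the text: it builds the coefficients of $P_n(x)$ as exact fractions and evaluates the polynomial by Horner's rule. The names `legendre_coeffs` and `evaluate` are assumptions introduced for the example.

```python
from fractions import Fraction
from math import factorial


def legendre_coeffs(n):
    """Coefficients of P_n(x) from formula (10); index i holds the coefficient of x**i."""
    coeffs = [Fraction(0)] * (n + 1)
    for k in range(n // 2 + 1):          # n // 2 equals N for both even and odd n
        num = (-1) ** k * factorial(2 * n - 2 * k)
        den = 2 ** n * factorial(k) * factorial(n - k) * factorial(n - 2 * k)
        coeffs[n - 2 * k] = Fraction(num, den)
    return coeffs


def evaluate(coeffs, x):
    """Evaluate a polynomial with ascending coefficients by Horner's rule."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result


p4 = legendre_coeffs(4)                                   # 3/8 - (15/4)x^2 + (35/8)x^4
values_at_one = [evaluate(legendre_coeffs(n), 1) for n in range(8)]
values_at_minus_one = [evaluate(legendre_coeffs(n), -1) for n in range(8)]
```

Working in exact fractions avoids the cancellation that affects floating-point evaluation of (10) for large n, at the cost of speed.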

One of the fundamental identities involving Legendre polynomials
is Rodrigues’ formula:*

THEOREM 1

$$P_n(x) = \frac{1}{2^n n!} \frac{d^n (x^2 - 1)^n}{dx^n}$$

* Named for the French economist and mathematician Olinde Rodrigues
(1794-1851).

PROOF This result can be proved by direct differentiation and induction, but
it is perhaps more interesting to proceed in the following way. If we let

$$v = (x^2 - 1)^n$$

then

$$\frac{dv}{dx} = 2nx(x^2 - 1)^{n-1}$$

or, multiplying the last equation by $x^2 - 1$,

$$(x^2 - 1)\frac{dv}{dx} = 2nx(x^2 - 1)^n = 2nxv$$

and, finally,

$$(1 - x^2)\frac{dv}{dx} + 2nxv = 0$$

If we differentiate this repeatedly with respect to x, we obtain

$$(1 - x^2)v'' + 2(n - 1)xv' + 2nv = 0$$

$$(1 - x^2)v''' + 2(n - 2)xv'' + 2(2n - 1)v' = 0$$

and, after $k + 1$ differentiations,

$$(1 - x^2)v^{(k+2)} + 2(n - k - 1)xv^{(k+1)} + (k + 1)(2n - k)v^{(k)} = 0$$

If we now take $k = n$ and put $v^{(n)} = u$, the last equation becomes

$$(1 - x^2)u'' - 2xu' + n(n + 1)u = 0$$

which is precisely Legendre’s equation. But

$$u = v^{(n)} = \frac{d^n (x^2 - 1)^n}{dx^n} = (-1)^n \frac{d^n (1 - x^2)^n}{dx^n}$$

is obviously a polynomial of degree n. Moreover, from (9) it is clear that to within
a constant factor there is only one polynomial solution of Legendre’s equation,
namely, $P_n(x)$. Hence, $P_n(x)$ must be some multiple of u or, what is the same thing,
of $d^n (1 - x^2)^n / dx^n$; that is,

$$P_n(x) = c\, \frac{d^n (1 - x^2)^n}{dx^n}$$

Finally, we can determine c by equating the coefficients of $x^n$ in the two members
of the last identity. Clearly, the coefficient of $x^n$ on the right-hand side is

$$(-1)^n c\,(2n)(2n - 1) \cdots [2n - (n - 1)] = (-1)^n \frac{(2n)!}{n!}\, c$$

Moreover, setting $k = 0$ in Eq. (10), we see that the coefficient of $x^n$ in $P_n(x)$ is

$$\frac{(2n)!}{2^n (n!)^2}$$

Hence, we must have

$$\frac{(2n)!}{2^n (n!)^2} = (-1)^n \frac{(2n)!}{n!}\, c \qquad \text{or} \qquad c = \frac{(-1)^n}{2^n n!}$$

which completes the verification of Rodrigues’ formula.
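
Rodrigues' formula can also be checked mechanically for small n. The following sketch, which is an illustration and not the author's method, expands $(x^2 - 1)^n$ by the binomial theorem, differentiates the coefficient list n times, and applies the factor $1/(2^n n!)$. The function name `rodrigues` is an assumption.

```python
from fractions import Fraction
from math import comb, factorial


def rodrigues(n):
    """Ascending coefficients of (1 / (2^n n!)) d^n/dx^n (x^2 - 1)^n."""
    # Binomial expansion: the coefficient of x^(2j) in (x^2 - 1)^n is C(n, j) (-1)^(n - j).
    poly = [Fraction(0)] * (2 * n + 1)
    for j in range(n + 1):
        poly[2 * j] = Fraction(comb(n, j) * (-1) ** (n - j))
    # Differentiate n times: x^i becomes i * x^(i - 1), and the constant term drops out.
    for _ in range(n):
        poly = [i * c for i, c in enumerate(poly)][1:]
    scale = Fraction(1, 2 ** n * factorial(n))
    return [scale * c for c in poly]


p2 = rodrigues(2)   # -1/2 + (3/2)x^2
p3 = rodrigues(3)   # -(3/2)x + (5/2)x^3
p5 = rodrigues(5)   # (15/8)x - (35/4)x^3 + (63/8)x^5
```

The results agree with the polynomials $P_2$, $P_3$, and $P_5$ listed after formula (10).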

Another important identity involving Legendre polynomials
is embodied in the following theorem:

THEOREM 2

$$\frac{1}{\sqrt{1 - 2xz + z^2}} = P_0(x) + P_1(x)z + P_2(x)z^2 + \cdots + P_n(x)z^n + \cdots$$

PROOF To prove this, we expand the radical on the left-hand side by the
binomial theorem, getting

$$[1 - z(2x - z)]^{-1/2} = 1 + \frac{1}{2} z(2x - z) + \frac{1 \cdot 3}{2^2\, 2!} z^2 (2x - z)^2 + \cdots + \frac{1 \cdot 3 \cdots (2n - 3)}{2^{n-1} (n - 1)!} z^{n-1} (2x - z)^{n-1} + \frac{1 \cdot 3 \cdots (2n - 1)}{2^n n!} z^n (2x - z)^n + \cdots$$

Now $z^n$ can occur only in the terms out to and including the one containing
$z^n (2x - z)^n$, and from these, by expanding the various powers of $(2x - z)$, we
find that its total coefficient is

$$\frac{1 \cdot 3 \cdots (2n - 1)}{2^n n!} (2x)^n - \frac{1 \cdot 3 \cdots (2n - 3)}{2^{n-1} (n - 1)!} \frac{n - 1}{1!} (2x)^{n-2} + \frac{1 \cdot 3 \cdots (2n - 5)}{2^{n-2} (n - 2)!} \frac{(n - 2)(n - 3)}{2!} (2x)^{n-4} - \cdots$$

or, multiplying and dividing by the factors needed to complete the factorials in
the numerators,

$$\frac{(2n)!}{2^n\, n!\, n!} x^n - \frac{(2n - 2)!}{2^n\, 1!\, (n - 1)!\, (n - 2)!} x^{n-2} + \frac{(2n - 4)!}{2^n\, 2!\, (n - 2)!\, (n - 4)!} x^{n-4} - \cdots$$

which is precisely the expanded form of $P_n(x)$, as given by (10). Thus

$$(1 - 2xz + z^2)^{-1/2} = \sum_{n=0}^{\infty} P_n(x) z^n$$

In other words, the expression $(1 - 2xz + z^2)^{-1/2}$ is a generating function for the
Legendre polynomials, analogous to the generating function $\exp\left[ \frac{x}{2}\left( t - \frac{1}{t} \right) \right]$ for
the Bessel functions which we investigated in Sec. 9.5.
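
A quick numerical check of the generating function is sketched below. It is only an illustration: the sample values x = 0.3 and z = 0.2, the truncation at 30 terms, and the function names are all assumptions chosen for the example. $P_n(x)$ is evaluated from the finite sum (10).

```python
from math import factorial, sqrt


def legendre_value(n, x):
    """P_n(x) evaluated from the finite sum (10) in floating point."""
    total = 0.0
    for k in range(n // 2 + 1):
        coeff = (-1) ** k * factorial(2 * n - 2 * k) / (
            2 ** n * factorial(k) * factorial(n - k) * factorial(n - 2 * k))
        total += coeff * x ** (n - 2 * k)
    return total


def generating_sum(x, z, terms=30):
    """Partial sum P_0(x) + P_1(x) z + ... + P_(terms-1)(x) z^(terms-1)."""
    return sum(legendre_value(n, x) * z ** n for n in range(terms))


x, z = 0.3, 0.2                                   # hypothetical sample point with |z| < 1
closed_form = 1.0 / sqrt(1.0 - 2.0 * x * z + z * z)
partial_sum = generating_sum(x, z)
difference = abs(partial_sum - closed_form)
```

Because $|P_n(x)| \le 1$ on the interval $-1 \le x \le 1$, the neglected tail is bounded by a geometric series in z, so a modest number of terms suffices when |z| is well below 1.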

In many applications the algebraic form of the Legendre
polynomials is the more useful. There are problems, however,
in which it is essential that they be expressed in terms of $\theta$, the
colatitude angle of the spherical coordinate system with which
our discussion began. This can easily be done by reversing the
transformation $x = \cos\theta$ which led from the trigonometric to
the algebraic form of Legendre’s equation. However, replacing
x by $\cos\theta$ in $P_n(x)$ leads to expressions which are quite inconvenient
because of the powers of $\cos\theta$ they contain. Fortunately,
using the generating function provided by Theorem 2, we can
easily derive more useful forms in which cosines of multiples of $\theta$
take the place of powers of $\cos\theta$.

To do this, let us substitute

$$x = \cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}$$

into the generating function, getting

$$[1 - z(e^{i\theta} + e^{-i\theta}) + z^2]^{-1/2} = [(1 - ze^{i\theta})(1 - ze^{-i\theta})]^{-1/2} = \sum_{n=0}^{\infty} P_n(\cos\theta)\, z^n$$

Now, if we use the binomial theorem to expand each of the factors
in the middle term of this continued identity, we obtain

$$(1 - ze^{i\theta})^{-1/2} = 1 + \frac{1}{2} ze^{i\theta} + \frac{1 \cdot 3}{2 \cdot 4} z^2 e^{2i\theta} + \cdots + \frac{1 \cdot 3 \cdots (2n - 1)}{2 \cdot 4 \cdots (2n)} z^n e^{ni\theta} + \cdots$$

and

$$(1 - ze^{-i\theta})^{-1/2} = 1 + \frac{1}{2} ze^{-i\theta} + \frac{1 \cdot 3}{2 \cdot 4} z^2 e^{-2i\theta} + \cdots + \frac{1 \cdot 3 \cdots (2n - 1)}{2 \cdot 4 \cdots (2n)} z^n e^{-ni\theta} + \cdots$$

The coefficient of $z^n$ in the product of these two series is easy to
determine, and we find for it the expression

$$\frac{1 \cdot 3 \cdots (2n - 1)}{2 \cdot 4 \cdots (2n)} (e^{ni\theta} + e^{-ni\theta}) + \frac{1}{2}\, \frac{1 \cdot 3 \cdots (2n - 3)}{2 \cdot 4 \cdots (2n - 2)} (e^{(n-2)i\theta} + e^{-(n-2)i\theta}) + \frac{1 \cdot 3}{2 \cdot 4}\, \frac{1 \cdot 3 \cdots (2n - 5)}{2 \cdot 4 \cdots (2n - 4)} (e^{(n-4)i\theta} + e^{-(n-4)i\theta}) + \cdots$$

Hence, replacing the various combinations of exponentials by
their cosine equivalents and recalling that the coefficient of $z^n$ in
the expansion of the generating function is just $P_n$, we have finally

$$P_n(\cos\theta) = \frac{1 \cdot 3 \cdots (2n - 1)}{2 \cdot 4 \cdots (2n)}\, 2\cos n\theta + \frac{1}{2}\, \frac{1 \cdot 3 \cdots (2n - 3)}{2 \cdot 4 \cdots (2n - 2)}\, 2\cos (n - 2)\theta + \frac{1 \cdot 3}{2 \cdot 4}\, \frac{1 \cdot 3 \cdots (2n - 5)}{2 \cdot 4 \cdots (2n - 4)}\, 2\cos (n - 4)\theta + \cdots \tag{11}$$

If n is odd, the final term in $P_n(\cos\theta)$ contains the factor $\cos\theta$
and is correctly given by the last nonzero term in the series (11).
However, if n is even, the final term in $P_n(\cos\theta)$ is a constant
which is equal to just half the last nonzero term in the series (11).
This is the case because, although the general term in the coefficient
of $z^n$ contains both $e^{(n-2k)i\theta}$ and $e^{-(n-2k)i\theta}$, when n is even
and $k = n/2$ these terms are identical and arise only once in the
product of the series for $(1 - ze^{i\theta})^{-1/2}$ and $(1 - ze^{-i\theta})^{-1/2}$ and not
twice. Thus, specifically,

$$P_0(\cos\theta) = 1$$

$$P_1(\cos\theta) = \cos\theta$$

$$P_2(\cos\theta) = \frac{3\cos 2\theta + 1}{4}$$

$$P_3(\cos\theta) = \frac{5\cos 3\theta + 3\cos\theta}{8}$$

$$P_4(\cos\theta) = \frac{35\cos 4\theta + 20\cos 2\theta + 9}{64}$$

$$P_5(\cos\theta) = \frac{63\cos 5\theta + 35\cos 3\theta + 30\cos\theta}{128}$$

Since Legendre’s equation can be written in the form

$$\frac{d[(1 - x^2)y']}{dx} + n(n + 1)y = 0$$

it is clear that it is a special case, with

$$p(x) = 1 \qquad q(x) = 0 \qquad r(x) = 1 - x^2 \qquad \lambda = n(n + 1)$$

of the equation covered by Theorem 4, Sec. 8.5. Hence, if solutions
of Legendre’s equation satisfy suitable boundary conditions,
they must be orthogonal. In particular, for the important
interval (-1,1) no boundary conditions are necessary, since
$r(x) = 1 - x^2$ vanishes at each end point; that is,

$$\int_{-1}^{1} P_m(x) P_n(x)\, dx = 0 \qquad m \neq n \tag{12}$$

Before the property of orthogonality can be used to expand
an arbitrary function in terms of Legendre polynomials, we must,
of course, know the value of the integral of the square of the
general Legendre polynomial. This can be obtained in various
ways, but perhaps the simplest is to use the generating function
provided by Theorem 2. If we square the identity

$$\frac{1}{(1 - 2xz + z^2)^{1/2}} = P_0(x) + P_1(x)z + \cdots + P_n(x)z^n + \cdots$$

and integrate with respect to x from -1 to 1, we obtain

$$\int_{-1}^{1} \frac{dx}{1 - 2xz + z^2} = \int_{-1}^{1} [P_0(x) + P_1(x)z + \cdots + P_n(x)z^n + \cdots]^2\, dx$$

The integral on the left is easily evaluated. On the right, all
integrals involving the product of two different P’s are zero
because of the orthogonality property (12). Hence,

$$-\frac{1}{2z} \ln (1 - 2xz + z^2) \Big|_{-1}^{1} = \int_{-1}^{1} P_0^2(x)\, dx + z^2 \int_{-1}^{1} P_1^2(x)\, dx + \cdots + z^{2n} \int_{-1}^{1} P_n^2(x)\, dx + \cdots \tag{13}$$

Evaluation of the left member leads at once to

$$-\frac{1}{2z} [\ln (1 - z)^2 - \ln (1 + z)^2] = \frac{1}{z} [\ln (1 + z) - \ln (1 - z)]$$

Moreover, if we replace the logarithms by their respective power
series, we obtain

$$\frac{1}{z} \left[ \left( z - \frac{z^2}{2} + \frac{z^3}{3} - \cdots - \frac{z^{2n}}{2n} + \frac{z^{2n+1}}{2n + 1} - \cdots \right) - \left( -z - \frac{z^2}{2} - \frac{z^3}{3} - \cdots - \frac{z^{2n}}{2n} - \frac{z^{2n+1}}{2n + 1} - \cdots \right) \right] = 2 \left( 1 + \frac{z^2}{3} + \frac{z^4}{5} + \cdots + \frac{z^{2n}}{2n + 1} + \cdots \right)$$

Hence, comparing coefficients of $z^{2n}$ in this series and in the right
member of (13), we obtain the desired result:

$$\int_{-1}^{1} P_n^2(x)\, dx = \frac{2}{2n + 1} \tag{14}$$

By means of the substitution $x = \cos\theta$, Eqs. (12) and (14)
can be transformed at once into corresponding results for the
Legendre polynomials in trigonometric form. Hence, we can state
the following important theorem:

THEOREM 3

The Legendre polynomials in algebraic form satisfy the orthogonality relations

$$\int_{-1}^{1} P_m(x) P_n(x)\, dx = \begin{cases} 0 & m \neq n \\[1ex] \dfrac{2}{2n + 1} & m = n \end{cases}$$

In trigonometric form, the Legendre polynomials satisfy the orthogonality relations

$$\int_{0}^{\pi} P_m(\cos\theta) P_n(\cos\theta) \sin\theta\, d\theta = \begin{cases} 0 & m \neq n \\[1ex] \dfrac{2}{2n + 1} & m = n \end{cases}$$
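
The orthogonality relations of Theorem 3 can be verified exactly for small m and n, since the integrand is a polynomial. The sketch below is an illustration only; it rebuilds the coefficients from formula (10) and integrates term by term over (-1,1). The names `legendre_coeffs` and `inner_product` are assumptions.

```python
from fractions import Fraction
from math import factorial


def legendre_coeffs(n):
    """Coefficients of P_n(x) from formula (10); index i holds the coefficient of x**i."""
    coeffs = [Fraction(0)] * (n + 1)
    for k in range(n // 2 + 1):
        num = (-1) ** k * factorial(2 * n - 2 * k)
        den = 2 ** n * factorial(k) * factorial(n - k) * factorial(n - 2 * k)
        coeffs[n - 2 * k] = Fraction(num, den)
    return coeffs


def inner_product(m, n):
    """Exact value of the integral of P_m(x) P_n(x) over the interval (-1, 1)."""
    p, q = legendre_coeffs(m), legendre_coeffs(n)
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:
                # integral of x^(i+j) over (-1, 1) is 2 / (i + j + 1) for even i + j,
                # and zero for odd i + j
                total += a * b * Fraction(2, i + j + 1)
    return total


gram = [[inner_product(m, n) for n in range(6)] for m in range(6)]
```

The resulting table `gram` is diagonal, with the entry in row n and column n equal to 2/(2n + 1).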


EXAMPLE 1

The known temperature distribution $u = f(\theta)$ is maintained over the entire surface of a sphere
of radius b. Find the steady-state temperature at any point in the sphere.

Here we have to solve the steady-state heat equation, i.e., Laplace’s equation, in spherical
coordinates. However, from the obvious circular symmetry of the problem it is clear that u is
a function of r and $\theta$ only. Hence $\partial^2 u / \partial \phi^2 = 0$, and Eq. (1) reduces to

$$r^2 \sin\theta \frac{\partial^2 u}{\partial r^2} + 2r \sin\theta \frac{\partial u}{\partial r} + \sin\theta \frac{\partial^2 u}{\partial \theta^2} + \cos\theta \frac{\partial u}{\partial \theta} = 0 \tag{15}$$

Assuming a product solution

$$u = R(r)\Theta(\theta)$$

and substituting into (15), we obtain

$$r^2 \sin\theta\, R''\Theta + 2r \sin\theta\, R'\Theta + \sin\theta\, R\Theta'' + \cos\theta\, R\Theta' = 0$$

From this, by dividing by $\sin\theta\, R\Theta$ and transposing, we have

$$\frac{r^2 R''}{R} + \frac{2rR'}{R} = -\frac{\Theta''}{\Theta} - \frac{\cos\theta}{\sin\theta} \frac{\Theta'}{\Theta} = \nu$$

Since for any $\nu$, the quadratic equation $n^2 + n - \nu = 0$ is always satisfied by at least one
(possibly complex) value of n, it is no specialization to take $\nu = n(n + 1)$, so that we have the
two ordinary differential equations

$$r^2 R'' + 2rR' - n(n + 1)R = 0$$

$$\sin\theta\, \Theta'' + \cos\theta\, \Theta' + n(n + 1) \sin\theta\, \Theta = 0$$

The first of these is just an instance of Euler’s equation, and its general solution is easily
found to be

$$R = Ar^n + \frac{B}{r^{n+1}}$$

However, since we require solutions which are finite when $r = 0$, it is clear that we must specialize
this by taking $B = 0$. The second equation is Legendre’s equation. Since we require
solutions of it which are finite over the closed interval $0 \le \theta \le \pi$ and since the only such solutions
are the Legendre polynomials $P_n(\cos\theta)$, it is clear that n must be an integer and

$$\Theta = P_n(\cos\theta)$$

Hence, we have the infinite sequence of product solutions

$$A_0 \qquad A_1 r P_1(\cos\theta) \qquad A_2 r^2 P_2(\cos\theta) \qquad \ldots \qquad A_n r^n P_n(\cos\theta) \qquad \ldots$$

None of these by itself can satisfy the given temperature condition

$$u(b,\theta) = f(\theta)$$

on the surface of the sphere. Hence, as usual, we form an infinite series of the individual product
solutions and attempt to make it fit the boundary condition. Thus we write

$$u(r,\theta) = \sum_{n=0}^{\infty} A_n r^n P_n(\cos\theta) \tag{16}$$

Then, substituting $r = b$ and $u(b,\theta) = f(\theta)$, we get

$$f(\theta) = \sum_{n=0}^{\infty} A_n b^n P_n(\cos\theta)$$

To find $A_n$ we multiply the last equation by $\sin\theta\, P_n(\cos\theta)$ and integrate from 0 to $\pi$. By virtue
of the orthogonality properties of the P’s, all integrals on the right except one become zero, and
we have

$$\int_{0}^{\pi} f(\theta) \sin\theta\, P_n(\cos\theta)\, d\theta = A_n b^n \frac{2}{2n + 1}$$

or

$$A_n = \frac{2n + 1}{2b^n} \int_{0}^{\pi} f(\theta) \sin\theta\, P_n(\cos\theta)\, d\theta$$

With the coefficients in the series (16) known, the problem is now formally solved.
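
To make the formal solution concrete, the sketch below evaluates the coefficients $A_n$ by a simple midpoint rule and sums the first few terms of (16). Everything specific in it is an assumption made for illustration: the surface temperature $f(\theta) = 100\cos^2\theta$, the radius b = 2, the number of quadrature steps, and the function names. For this particular f the expansion is known in closed form, $f = 100[\tfrac{1}{3}P_0 + \tfrac{2}{3}P_2(\cos\theta)]$, which gives a check on the numerical values.

```python
from math import cos, sin, pi, factorial


def legendre_value(n, x):
    """P_n(x) evaluated from the finite sum (10)."""
    total = 0.0
    for k in range(n // 2 + 1):
        coeff = (-1) ** k * factorial(2 * n - 2 * k) / (
            2 ** n * factorial(k) * factorial(n - k) * factorial(n - 2 * k))
        total += coeff * x ** (n - 2 * k)
    return total


def sphere_coefficient(f, n, b, steps=2000):
    """A_n = (2n + 1) / (2 b^n) * integral over (0, pi) of f(t) sin t P_n(cos t) dt."""
    h = pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h                      # midpoint of the i-th subinterval
        total += f(t) * sin(t) * legendre_value(n, cos(t))
    return (2 * n + 1) / (2.0 * b ** n) * total * h


def temperature(f, r, theta, b, terms=6):
    """Partial sum of series (16) using the first `terms` coefficients."""
    return sum(sphere_coefficient(f, n, b) * r ** n * legendre_value(n, cos(theta))
               for n in range(terms))


def surface_temperature(t):
    """Hypothetical boundary data, chosen only because its expansion is known exactly."""
    return 100.0 * cos(t) ** 2


b = 2.0                                         # hypothetical sphere radius
a0 = sphere_coefficient(surface_temperature, 0, b)
a2 = sphere_coefficient(surface_temperature, 2, b)
center = temperature(surface_temperature, 0.0, 0.3, b)
```

At the center only the n = 0 term survives, so the computed temperature there is the mean of f over the sphere, 100/3 in this example.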


1 a Show that any polynomial in x of degree m can be represented uniquely by a finite sum
of the form

$$a_0 P_0(x) + a_1 P_1(x) + \cdots + a_m P_m(x)$$

$$\int_{-1}^{1} x^m P_n(x)\, dx = 0 \qquad \text{if } m < n$$

b Express $x^2$ and $x^3$ as linear combinations of Legendre polynomials.

2 Consider the functions $x^i$ (i = 0, 1, 2, 3, . . .), and let $f_n(x) = \sum_{i=0}^{n} a_{ni} x^i$, where the a’s
are coefficients to be determined so that

$$\int_{-1}^{1} f_m(x) f_n(x)\, dx = \begin{cases} 0 & m \neq n \\[1ex] \dfrac{2}{2n + 1} & m = n \end{cases}$$

Calculate the a’s for n = 0, 1, 2, 3, and show that, at least for these values of n, $f_n(x) = P_n(x)$.

3 It is desired to approximate a function f(x) over the interval (-1,1) by a polynomial P(x)
of degree n which will make the integral

$$\int_{-1}^{1} [f(x) - P(x)]^2\, dx$$

a minimum. Show that P(x) is the nth partial sum of the expansion of f(x) over the interval
(-1,1) in terms of Legendre polynomials. (The Legendre polynomials thus play the same
role in the least-square approximation of continuous functions that the orthogonal polynomials
discussed in Sec. 4.6 play in the least-square approximation of tabular functions.)

4 Show that the Legendre polynomials with even subscripts and the Legendre polynomials
with odd subscripts both form orthogonal sets over the interval (0,1).

5 By differentiating the generating function for the Legendre polynomials, show that all
derivatives of even order of $P_n(x)$ vanish at x = 0 if n is odd, and that all derivatives of odd
order vanish at x = 0 if n is even. What are the values of the nonzero derivatives at x = 0?

6 Using Rodrigues’ formula, prove that

$$P'_{n+1}(x) - P'_{n-1}(x) = (2n + 1) P_n(x)$$

Hence show that

$$\int_{x}^{1} P_n(x)\, dx = \frac{P_{n-1}(x) - P_{n+1}(x)}{2n + 1}$$

7 Find the steady-state temperature at any point in a spherical shell of inner radius $b_1$ and
outer radius $b_2$ if the temperature distributions $u(b_1,\theta) = f_1(\theta)$ and $u(b_2,\theta) = f_2(\theta)$ are
maintained over the inner and outer surfaces, respectively.

8 The temperature distribution $u(b,\theta) = f(\theta)$ is maintained over the curved surface of a
hemisphere of radius b. The plane boundary of the hemisphere is kept at the temperature
u = 0. Find the steady-state temperature at any point in the hemisphere. (Hint: The
results of Exercise 4 may be of assistance.)

9 Find polynomial solutions of the equation

$$y'' - 2xy' + 2ny = 0 \qquad n \text{ an integer}$$

and show that they are orthogonal with respect to the weight function $e^{-x^2}$ over the interval
$(-\infty, \infty)$. [This equation is known as Hermite’s equation, after the French mathematician
Charles Hermite (1822-1901), and its polynomial solutions are known as Hermite
polynomials.]

10 Find polynomial solutions of the equation

$$xy'' + (1 - x)y' + ny = 0 \qquad n \text{ an integer}$$

and show that they are orthogonal with respect to the weight function $e^{-x}$ over the interval
$(0, \infty)$. [This equation is known as Laguerre’s equation, after the French mathematician
Edmond Laguerre (1834-1886), and its polynomial solutions are known as Laguerre
polynomials.]

CHAPTER TEN 


Determinants and Matrices

10.1 

Determinants 

In a restricted sense, at least, the concept of a determinant is 
already familiar from elementary algebra, where, in solving sys- 
tems of two and three simultaneous linear equations, we found 
it convenient to introduce what we called determinants of the 
second and third order. In the work of this book we shall have 
occasion to generalize these ideas to the solution of systems of 
more than three linear equations and to other applications not 
immediately associated with solving equations. For this reason 
we shall devote this and the following chapter to a review and an 
extension of our earlier study of determinants and to a discussion 
of some of the fundamental properties of the related mathematical 
objects known as matrices. 

By a determinant of order n we mean a certain function of
$n^2$ quantities, which we shall describe more precisely as soon as we
have introduced the necessary notation and preliminary definitions.
The customary symbol for a determinant consists of a
square array of the $n^2$ quantities enclosed between vertical bars:†

$$|A| = |a_{ij}| = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} \tag{1}$$

For brevity, we shall often use the word determinant to refer to
this symbol as well as to the expansion‡ for which it stands.
Although logically undesirable, this dual usage is quite common
and should cause no confusion.

† The use of vertical bars in the notation for a determinant and in the notation
for the absolute value of a quantity, while perhaps unfortunate, is
universal. Which meaning is intended in any particular case should always
be clear from the context.

‡ See Definition 1, this section, p. 403.

The quantities $a_{ij}$ which appear in (1) are called the elements
of the determinant. The horizontal lines of elements are called
rows; the vertical lines of elements are called columns. In the
convenient double subscript notation illustrated in (1), the first
subscript associated with an element identifies the row and the
second subscript identifies the column in which the element lies.
There is, of course, no reason to suppose that the element in the
ith row and jth column is the same as the element in the jth row
and ith column, and so in general $a_{ij} \neq a_{ji}$. The sloping line of
elements extending from $a_{11}$ to $a_{nn}$ is called the principal diagonal
of the determinant.

The determinant |M| formed by the $m^2$ elements common to
any m rows and any m columns of an nth-order determinant |A|
is said to be an mth-order minor of |A|. The determinant of
order n - m formed by the elements which remain when the m
rows and m columns containing an mth-order minor |M| are
deleted from |A| is called the complementary minor of |M|. If
the numbers of the rows and columns of |A| which contain an
mth-order minor |M| are, respectively,

$$i_1, i_2, \ldots, i_m \qquad \text{and} \qquad j_1, j_2, \ldots, j_m$$

then $(-1)^{i_1 + i_2 + \cdots + i_m + j_1 + j_2 + \cdots + j_m}$ times the complementary minor
of |M| is called the algebraic complement of |M|. The first-order
minors of |A| are, of course, just the elements of |A|. Their
complementary minors are customarily referred to simply as
minors, and their algebraic complements are almost universally
referred to as cofactors. We shall denote the minor of the element
$a_{ij}$ by the symbol $M_{ij}$ and its cofactor by the symbol $A_{ij}$; thus,

$$A_{ij} = (-1)^{i+j} M_{ij}$$

Similarly, we shall use the symbols $M_{ij,kl}$ and $A_{ij,kl}$ to denote,
respectively, the complementary minor and the algebraic complement
of the second-order minor contained in the ith and jth
rows and the kth and lth columns of a determinant |A|; thus,

$$A_{ij,kl} = (-1)^{i+j+k+l} M_{ij,kl}$$

The generalization of this notation is obvious.
EXAMPLE 1

In the fifth-order determinant

$$|A| = \begin{vmatrix} a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\ a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\ a_{31} & a_{32} & a_{33} & a_{34} & a_{35} \\ a_{41} & a_{42} & a_{43} & a_{44} & a_{45} \\ a_{51} & a_{52} & a_{53} & a_{54} & a_{55} \end{vmatrix}$$

the minor of the element $a_{43}$ is the fourth-order determinant formed by the elements which
remain when the fourth row and third column are deleted from |A|, namely,

$$M_{43} = \begin{vmatrix} a_{11} & a_{12} & a_{14} & a_{15} \\ a_{21} & a_{22} & a_{24} & a_{25} \\ a_{31} & a_{32} & a_{34} & a_{35} \\ a_{51} & a_{52} & a_{54} & a_{55} \end{vmatrix}$$

The cofactor $A_{43}$ of the element $a_{43}$ is equal to this minor times $(-1)^{4+3}$; i.e.,

$$A_{43} = -M_{43}$$

Similarly, the complementary minor of the second-order minor

$$\begin{vmatrix} a_{31} & a_{34} \\ a_{51} & a_{54} \end{vmatrix}$$

contained in the third and fifth rows and the first and fourth columns of |A| is the third-order
determinant formed by the elements which remain when these rows and columns are deleted
from |A|:

$$M_{35,14} = \begin{vmatrix} a_{12} & a_{13} & a_{15} \\ a_{22} & a_{23} & a_{25} \\ a_{42} & a_{43} & a_{45} \end{vmatrix}$$

The algebraic complement $A_{35,14}$ of the given second-order minor is equal to the complementary
minor $M_{35,14}$ times $(-1)^{3+5+1+4}$; i.e.,

$$A_{35,14} = -M_{35,14}$$
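
The bookkeeping in Example 1 can be mirrored in a few lines of code. The sketch below is an illustration with hypothetical function names; it represents the fifth-order array by the two-digit labels 11, 12, . . . , 55 standing for $a_{11}, a_{12}, \ldots, a_{55}$, deletes the indicated rows and columns, and computes the sign attached to the algebraic complement. It returns the array of the complementary minor, not its numerical value.

```python
def complementary_minor_entries(a, rows, cols):
    """Array left when the given rows and columns (numbered from 1) are deleted from a."""
    n = len(a)
    return [[a[i][j] for j in range(n) if j + 1 not in cols]
            for i in range(n) if i + 1 not in rows]


def complement_sign(rows, cols):
    """(-1) raised to the sum of the row numbers plus the sum of the column numbers."""
    return -1 if (sum(rows) + sum(cols)) % 2 else 1


# Entry in row i, column j is the label 10*i + j, so that 43 stands for a_43.
A = [[10 * i + j for j in range(1, 6)] for i in range(1, 6)]

M43 = complementary_minor_entries(A, rows=[4], cols=[3])
sign_43 = complement_sign([4], [3])

M35_14 = complementary_minor_entries(A, rows=[3, 5], cols=[1, 4])
sign_35_14 = complement_sign([3, 5], [1, 4])
```

Both signs come out negative, in agreement with $A_{43} = -M_{43}$ and $A_{35,14} = -M_{35,14}$.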

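For a second-order determinant we have the definition

$$\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21} \tag{2}$$

that is, a second-order determinant is equal to the difference between
the product of the elements on the principal diagonal and
the product of the elements on the other diagonal. For a third-order
determinant we have the definition

$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} \tag{3}$$

This expansion can also be obtained by diagonal multiplication,
by repeating on the right the first two columns of the determinant
and then adding the signed products of the elements on the various
diagonals in the resulting array:

$$\begin{matrix} a_{11} & a_{12} & a_{13} & a_{11} & a_{12} \\ a_{21} & a_{22} & a_{23} & a_{21} & a_{22} \\ a_{31} & a_{32} & a_{33} & a_{31} & a_{32} \end{matrix}$$

(The products along the three diagonals sloping downward to the right carry the sign +;
the products along the three diagonals sloping upward to the right carry the sign -.)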

The diagonal method of writing out a determinant is correct only 
for determinants of the second and third orders, however, and will 
in general give incorrect results if applied to determinants of 
higher order. 
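
The warning about higher orders can be illustrated numerically. In the sketch below, which is an illustration and not part of the original text, `det_cofactor` expands along the first row, while `det_diagonal_rule` applies the "repeat the first n - 1 columns" scheme to an n by n array. Both names and both sample arrays are assumptions; the diagonal routine is meant only for n = 3 and n = 4, and does not reproduce the second-order rule.

```python
def det_cofactor(a):
    """Determinant by expansion along the first row."""
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in a[1:]]
        total += (-1) ** j * a[0][j] * det_cofactor(minor)
    return total


def det_diagonal_rule(a):
    """Sum of the n downward diagonal products minus the n upward diagonal products.

    Column indices wrap around, which is equivalent to repeating the first n - 1
    columns on the right. Intended here for n = 3 and n = 4 only.
    """
    n = len(a)
    total = 0
    for s in range(n):
        down = up = 1
        for i in range(n):
            down *= a[i][(s + i) % n]
            up *= a[n - 1 - i][(s + i) % n]
        total += down - up
    return total


B3 = [[2, 0, 1],
      [1, 3, 2],
      [1, 1, 4]]

# Fourth-order array with ones on the secondary diagonal and zeros elsewhere.
J4 = [[0, 0, 0, 1],
      [0, 0, 1, 0],
      [0, 1, 0, 0],
      [1, 0, 0, 0]]

third_order = (det_cofactor(B3), det_diagonal_rule(B3))
fourth_order = (det_cofactor(J4), det_diagonal_rule(J4))
```

For the third-order array the two methods agree; for the fourth-order array the cofactor expansion gives 1 while the diagonal scheme gives -1.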






We are now in a position to give the general definition of a
determinant. This can be done in direct fashion, but the result is
unsuited to the practical evaluation of determinants; hence, we
choose to give an inductive definition:

DEFINITION 1

The determinant

$$|A| = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$$

is equal to the sum of the products of the elements of any row or column and their
respective cofactors; i.e.,

$$|A| = \sum_{j=1}^{n} a_{ij} A_{ij} = \sum_{i=1}^{n} a_{ij} A_{ij}$$

Clearly, this definition makes a determinant of order n depend
upon n determinants of order n - 1, each of which in turn
depends upon n - 1 determinants of order n - 2, and so on,
until finally the expansion involves only second- or third-order
determinants which can be written out by the diagonal method.
However, before Definition 1 can be accepted and used, it must be
shown that the same expansion is obtained no matter which row or
which column is selected. That this is the case is guaranteed by
the following theorem:


THEOREM 1

If the elements of any row or of any column of a determinant are multiplied by
their respective cofactors and then added, the sum is the same for all rows and for
all columns.

PROOF We shall first prove that the same expansion is obtained no matter
which row is chosen. To do this we proceed inductively. Clearly, the theorem is
true when n = 2; for, expanding the determinant

$$\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$$

in terms of the elements of the first row and their cofactors, we get

$$a_{11}(a_{22}) + a_{12}(-a_{21})$$

and, expanding in terms of the elements of the second row and their cofactors, we get

$$a_{21}(-a_{12}) + a_{22}(a_{11})$$

and these two expressions are identical. Let us assume, then, that the assertion of
the theorem is true for determinants of order n - 1, and let us attempt to prove
that it is true for determinants of order n. Specifically, let us expand the nth-order
determinant

$$|A| = |a_{ij}|$$

in terms of each of two arbitrary rows, say the ith and the jth, and compare the
expansions. In doing this it is, of course, no specialization to assume i < j.

[Displays (4a) and (4b): the determinant |A| with the ith and jth rows and the kth and lth
columns marked, showing the elements $a_{ik}$ and $a_{jl}$ for the cases k < l (4a) and k > l (4b).]

Now, a typical term in the expansion of |A| according to the elements of the
ith row is

$$a_{ik} A_{ik} = (-1)^{i+k} a_{ik} M_{ik} \tag{5}$$

and this contains the only occurrences of $a_{ik}$ in the entire expansion. Moreover, the
(j - 1)st row of $M_{ik}$ contains n - 1 elements from the jth row of |A|, and $M_{ik}$ can
legitimately be expanded in terms of these elements, since the hypothesis of our
induction is that the theorem in question is true for determinants of order n - 1.
As the typical term in the expansion of $M_{ik}$ according to the elements from the jth
row of |A|, we therefore have

$$a_{jl} \cdot (\text{cofactor of } a_{jl} \text{ in } M_{ik}) \qquad l \neq k \tag{6}$$

and this contains the only occurrences of $a_{jl}$ in the expansion of $M_{ik}$. Hence, substituting
the expression (6) into Eq. (5), we find that the expression

$$(-1)^{i+k} a_{ik} \cdot [a_{jl} \cdot (\text{cofactor of } a_{jl} \text{ in } M_{ik})] \qquad l \neq k \tag{7}$$


SEC. 1 0.1 


DETERMINANTS 


405 


contains the only occurrences of the product a ik aji in the expansion of ] A j in terms 
of the elements of the fth row. In exactly the same way, if we first expand |Aj in 
terms of the elements of the jth row and then expand the minor M# in terms of the 
n — 1 elements from the ith row of |A| which it contains, we conclude that the 
only occurrences of the product a ik aji in the expansion of |A| in terms of the ele- 
ments of the jth row are contained in the expression 

(8) ( - 1 Y +l aji • [a ii5 • (cofactor of a ik in Mji)] k ^ l 

If we can show that (7) and (8) are identical, we will have completed our proof 
that under the induction hypothesis all row expansions of |A| are the same, since 
a-ikdji (7c 7) is the typical product of an element from the ith row and an element 

from the jth row, and each term in the expansion of |A| according to the ith row 
or the jth row must contain one and only one such product. 

Now, except for the proper power of (—1), both the cofactor of aji in M ik 
and the cofactor of aik in Mji are equal to the determinant of order n — 2, say 
Mum, formed by the elements which remain when the fth and jth rows and the 
fcth and 7th columns are deleted from \A\. There are two possibilities to consider, 
according as 7c < 7 (4a) or k > 7 (4 6) ; i.e., according as the 7cth column precedes 
or follows the 7th column in the determinant |A|. Taking due account of the 
relative positions of the deleted rows and columns, the proper signs are easily 
determined by inspection, however, and we have, respectively, in the two cases: 

Cofactor of a „ in M ik = (-l)tf-i>+ff-i>M« tW 
Cofactor of a* in Mji — (— l) i+k Mu,ki 
Cofactor of aji in Mu, — ( — 1 ) u ~ l)+l Ma,ki | ^ ^ 

Cofactor of a ik in M st — ( — | 

Finally, substituting these expi’essions into (7) and (8), we find that the coeffi- 
cient of the product aikaji, as determined by either method of expansion, is 

(9a) (_i y+i+WMijM k < 7 

(96) (-l) i+ i +k+l - l MijM = -(-iy + ’' +k+l M i3 M k>l 

In exactly the same way, if we expand |A| in terms of two arbitrary columns,
say the kth and the lth, we find that the coefficient of the general product $a_{ik}a_{jl}$
is still given by (9a) and (9b). This proves that, under the induction hypothesis,
not only are all column expansions of |A| equal but their common value is the
common value of the row expansions of |A|. Thus we have completed our proof
that if the theorem is true for determinants of order n - 1, then it is true for
determinants of order n. Since we have already proved it true for row expansions
of second-order determinants and could similarly prove it true for column expan-
sions, our induction is complete; Theorem 1 is established; and Definition 1 is
unambiguous.

Since the same expression is obtained whether we expand a 
determinant in terms of the elements of an arbitrary row or an 
arbitrary column, we have the following obvious consequence 
of Theorem 1 : 




THEOREM 2 

If |A| is any determinant and if |B| is the determinant whose rows are the columns
of |A|, then |A| = |B|.

The proof of Theorem 1 also provides us with the proof of 
the following important result: 


THEOREM 3 

Let any two rows (or columns) be selected from a determinant |A|. Then |A| is
equal to the sum of the products of all the second-order minors contained in
the chosen pair of rows (or columns) each multiplied by its algebraic complement.

PROOF Let the chosen rows be the pth and the qth, and, for definiteness,
suppose that p < q. Now a typical second-order minor from these rows is

(10) $\begin{vmatrix} a_{pr} & a_{ps}\\ a_{qr} & a_{qs} \end{vmatrix} = a_{pr}a_{qs} - a_{ps}a_{qr} \qquad r < s$

and to prove the assertion of the theorem it is sufficient to show that the coeffi-
cient of this binomial in the expansion of |A| is

$$A_{pq,rs} = (-1)^{p+q+r+s} M_{pq,rs}$$

To do this, we observe first that, from Eq. (9a), with i = p, j = q, k = r, and
l = s, the coefficient of the product $a_{pr}a_{qs}$ in the expansion of |A| is

$$(-1)^{p+q+r+s} M_{pq,rs}$$

Also, taking i = p, j = q, k = s, and l = r in Eq. (9b), we find that the coefficient
of $a_{ps}a_{qr}$ in the expansion of |A| is

$$-(-1)^{p+q+r+s} M_{pq,rs}$$

Hence, the expansion of |A| contains the terms

$$a_{pr}a_{qs}\,[(-1)^{p+q+r+s} M_{pq,rs}] + a_{ps}a_{qr}\,[-(-1)^{p+q+r+s} M_{pq,rs}]$$

and these are the only occurrences of the products $a_{pr}a_{qs}$ and $a_{ps}a_{qr}$. Finally, from
these, by factoring, we obtain

$$(a_{pr}a_{qs} - a_{ps}a_{qr})\,[(-1)^{p+q+r+s} M_{pq,rs}] = (a_{pr}a_{qs} - a_{ps}a_{qr})\,A_{pq,rs}$$

which completes the proof of the theorem in the case where two rows are used for
the expansion. An essentially identical argument establishes the assertion of the
theorem when two columns are used.

By a somewhat more involved argument the following gen- 
eralization of Theorem 3 can be established : 

THEOREM 4 

Let any m rows (or columns) be selected from a determinant |A|. Then |A| is
equal to the sum of the products of all the mth-order minors contained in the
chosen rows (or columns) each multiplied by its algebraic complement.

Both the general result contained in Theorem 4 and the special
case m = 2 contained in Theorem 3 are usually referred to as
Laplace's expansion.




EXAMPLE 2 

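Expand the determinant

$$|A| = \begin{vmatrix} 1 & 2 & 3 & 4\\ 4 & 3 & 2 & 1\\ 0 & -1 & 2 & 3\\ 1 & 6 & 4 & -2 \end{vmatrix}$$

For purposes of illustration we shall obtain the value of this determinant using Definition 1
and also using Theorem 3. According to Definition 1, using the third row because of the presence
of the zero element, we have

$$|A| = (0)\begin{vmatrix} 2 & 3 & 4\\ 3 & 2 & 1\\ 6 & 4 & -2 \end{vmatrix}
- (-1)\begin{vmatrix} 1 & 3 & 4\\ 4 & 2 & 1\\ 1 & 4 & -2 \end{vmatrix}
+ (2)\begin{vmatrix} 1 & 2 & 4\\ 4 & 3 & 1\\ 1 & 6 & -2 \end{vmatrix}
- (3)\begin{vmatrix} 1 & 2 & 3\\ 4 & 3 & 2\\ 1 & 6 & 4 \end{vmatrix}$$

or, expanding the third-order determinants by the diagonal method,

$$|A| = 0 + 75 + 180 - 105 = 150$$

Equivalently, applying Theorem 3 in terms of the first two rows, we have

$$\begin{aligned}
|A| ={}& \begin{vmatrix} 1 & 2\\ 4 & 3 \end{vmatrix}\begin{vmatrix} 2 & 3\\ 4 & -2 \end{vmatrix}
- \begin{vmatrix} 1 & 3\\ 4 & 2 \end{vmatrix}\begin{vmatrix} -1 & 3\\ 6 & -2 \end{vmatrix}
+ \begin{vmatrix} 1 & 4\\ 4 & 1 \end{vmatrix}\begin{vmatrix} -1 & 2\\ 6 & 4 \end{vmatrix}\\
&+ \begin{vmatrix} 2 & 3\\ 3 & 2 \end{vmatrix}\begin{vmatrix} 0 & 3\\ 1 & -2 \end{vmatrix}
- \begin{vmatrix} 2 & 4\\ 3 & 1 \end{vmatrix}\begin{vmatrix} 0 & 2\\ 1 & 4 \end{vmatrix}
+ \begin{vmatrix} 3 & 4\\ 2 & 1 \end{vmatrix}\begin{vmatrix} 0 & -1\\ 1 & 6 \end{vmatrix}\\
={}& (-5)(-16) - (-10)(-16) + (-15)(-16) + (-5)(-3) - (-10)(-2) + (-5)(1)\\
={}& 150
\end{aligned}$$

as before.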
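
Both computations in Example 2 can be checked mechanically. The following Python sketch is an illustration only and is not part of the original text: the function names `det` and `laplace_two_rows` are hypothetical, rows and columns are numbered from 0 rather than 1 (which leaves the parity of p + q + r + s unchanged), and the entries of `A` are those read from the expansions in Example 2.

```python
from itertools import combinations


def submatrix(a, rows, cols):
    """Array of the elements of a common to the listed rows and columns."""
    return [[a[i][j] for j in cols] for i in rows]


def det(a):
    """Determinant by expansion along the first row, as in Definition 1."""
    n = len(a)
    if n == 0:
        return 1  # convention: the empty determinant is 1
    total = 0
    for j in range(n):
        other_cols = [c for c in range(n) if c != j]
        total += (-1) ** j * a[0][j] * det(submatrix(a, range(1, n), other_cols))
    return total


def laplace_two_rows(a, p, q):
    """Theorem 3: expansion of |a| in terms of rows p < q (0-based)."""
    n = len(a)
    other_rows = [i for i in range(n) if i not in (p, q)]
    total = 0
    for r, s in combinations(range(n), 2):  # all column pairs with r < s
        minor2 = a[p][r] * a[q][s] - a[p][s] * a[q][r]
        other_cols = [c for c in range(n) if c not in (r, s)]
        sign = (-1) ** (p + q + r + s)
        total += sign * minor2 * det(submatrix(a, other_rows, other_cols))
    return total


A = [[1, 2, 3, 4],
     [4, 3, 2, 1],
     [0, -1, 2, 3],
     [1, 6, 4, -2]]

print(det(A), laplace_two_rows(A, 0, 1))
```

Because Theorem 3 holds for any pair of rows, `laplace_two_rows(A, 1, 3)` should agree with `det(A)` as well.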

Using Theorems 1 and 3, a number of other theorems can 
easily be proved. In particular, we have the following useful 
results: 


THEOREM 5 

If all the elements in any row or in any column of a determinant are zero, the 
value of the determinant is zero. 

PROOF If we expand the given determinant, according to Definition 1, in 
terms of the row or column of zero elements, each term in the expansion contains 
a zero factor. Hence, the entire expansion is zero, as asserted. 

THEOREM 6 

If each element in one row or in one column of a determinant is multiplied by c, 
the determinant is multiplied by c. 

PROOF If we expand the given determinant in terms of the row or column 
whose elements have been multiplied by c, each term in the expansion contains 
c as a factor. If c is then factored from the expansion, the result is just c times the 
expansion of the original determinant, as asserted. 

THEOREM 7 

If |A| is any determinant and if |B| is the determinant obtained from |A| by
interchanging any two rows or any two columns of |A|, then |B| = -|A|.




PROOF Let |A| be any determinant, and let |B| be the determinant obtained
from |A| by interchanging any two rows (or any two columns) of |A|. Now,
clearly, if the rows (or the columns) of any second-order determinant are inter-
changed, the resulting determinant is the negative of the original one. Hence,
if |B| is expanded in terms of the two rows (or columns) which were interchanged,
it follows that each second-order minor occurring as a factor of a term in this
expansion is the negative of the corresponding second-order minor from the
corresponding pair of rows (or columns) in |A|. Therefore, each term in the
expansion of |B| is the negative of the corresponding term in the expansion of
|A| based on the same two rows (or columns). Thus |B| = -|A|, as asserted.


THEOREM 8 

If corresponding elements of two rows or of two columns of a determinant are 
proportional, the value of the determinant is zero. 

PROOF Clearly, any determinant of the second order whose rows or columns 
are proportional is zero. Hence if we expand the given determinant, according to 
Theorem 3, in terms of the two rows or two columns which are proportional, it 
follows that each term contains as one factor a second-order minor equal to zero. 
Therefore, the entire expansion is zero, as asserted. 
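
Theorems 2, 6, 7, and 8 are easy to spot-check numerically. The sketch below is an illustration under stated assumptions, not part of the original text: the 3 x 3 array `A` and the helper `det` (a recursive first-row expansion) are hypothetical choices made only for the check.

```python
def det(a):
    """First-row cofactor expansion; the empty determinant is taken as 1."""
    if not a:
        return 1
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))


A = [[2, -1, 3],
     [0, 4, 1],
     [5, 2, -2]]

transposed = [list(col) for col in zip(*A)]          # rows become columns (Theorem 2)
scaled = [A[0], [7 * x for x in A[1]], A[2]]         # second row multiplied by 7 (Theorem 6)
swapped = [A[2], A[1], A[0]]                         # first and third rows interchanged (Theorem 7)
proportional = [A[0], [3 * x for x in A[0]], A[2]]   # second row = 3 * first row (Theorem 8)

print(det(A), det(transposed), det(scaled), det(swapped), det(proportional))
```

A check on one array proves nothing, of course; it only guards against misreading the statements.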

THEOREM 9 

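If the elements in one column of a determinant are expressed as binomials, the
determinant can be written as the sum of two determinants, according to the
formula

$$\begin{vmatrix}
a_{11} & \cdots & a_{1j} + \alpha_{1j} & \cdots & a_{1n}\\
a_{21} & \cdots & a_{2j} + \alpha_{2j} & \cdots & a_{2n}\\
\vdots & & \vdots & & \vdots\\
a_{n1} & \cdots & a_{nj} + \alpha_{nj} & \cdots & a_{nn}
\end{vmatrix}
=
\begin{vmatrix}
a_{11} & \cdots & a_{1j} & \cdots & a_{1n}\\
a_{21} & \cdots & a_{2j} & \cdots & a_{2n}\\
\vdots & & \vdots & & \vdots\\
a_{n1} & \cdots & a_{nj} & \cdots & a_{nn}
\end{vmatrix}
+
\begin{vmatrix}
a_{11} & \cdots & \alpha_{1j} & \cdots & a_{1n}\\
a_{21} & \cdots & \alpha_{2j} & \cdots & a_{2n}\\
\vdots & & \vdots & & \vdots\\
a_{n1} & \cdots & \alpha_{nj} & \cdots & a_{nn}
\end{vmatrix}$$

A similar result holds for a determinant containing a row of elements which are
binomials.

PROOF If we expand the given determinant, according to Definition 1, in
terms of the column which contains the binomial elements, we obtain

$$\sum_{i=1}^{n} (a_{ij} + \alpha_{ij}) A_{ij} = \sum_{i=1}^{n} a_{ij} A_{ij} + \sum_{i=1}^{n} \alpha_{ij} A_{ij}$$

Since the sums on the right are, respectively, the expansions for the determi-
nants appearing on the right side of the formula in the theorem, the theorem is
established.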

THEOREM 10 

The value of a determinant is unchanged if the elements of any row (or column) 
are modified by adding to them the same multiple of the corresponding elements 
in any other row (or column). 




PROOF If we apply Theorem 9 to the determinant resulting from the given
row (or column) modification, we obtain two determinants, one of which is the
original determinant and the other of which contains two proportional rows (or
columns). By Theorem 8, the second determinant is equal to zero, and the
theorem is established.

Theorem 10 is very useful in the practical expansion of de- 
terminants, for, by its repeated application, one can reduce to 
zero a number of the elements in a chosen row (or column) of 
the given determinant. Then, when the determinant is expanded 
in terms of this row (or column) most of the products involved 
will be zero and the computation will be appreciably shortened. 

EXAMPLE 3 

Find the value of the determinant

$$\begin{vmatrix}
3 & 1 & -1 & 2 & 1\\
0 & 3 & 1 & 4 & 2\\
1 & 4 & 2 & 3 & 1\\
5 & -1 & -3 & 2 & 5\\
-1 & 1 & 2 & 3 & 2
\end{vmatrix}$$

Here, in an attempt to introduce as many zeros as possible into some row, let us add the
third column to the second and to the fifth, and let us add twice the third column to the fourth
and 3 times the third column to the first. This gives the new but equal determinant

$$\begin{vmatrix}
0 & 0 & -1 & 0 & 0\\
3 & 4 & 1 & 6 & 3\\
7 & 6 & 2 & 7 & 3\\
-4 & -4 & -3 & -4 & 2\\
5 & 3 & 2 & 7 & 4
\end{vmatrix}$$

Expanding this in terms of the first row, according to Definition 1, we have

$$(-1)(-1)^{1+3}\begin{vmatrix}
3 & 4 & 6 & 3\\
7 & 6 & 7 & 3\\
-4 & -4 & -4 & 2\\
5 & 3 & 7 & 4
\end{vmatrix}$$

Now, adding twice the last column to each of the first three, we obtain the equal determinant

$$-\begin{vmatrix}
9 & 10 & 12 & 3\\
13 & 12 & 13 & 3\\
0 & 0 & 0 & 2\\
13 & 11 & 15 & 4
\end{vmatrix}$$

or, expanding in terms of the third row,

$$-(2)(-1)^{3+4}\begin{vmatrix}
9 & 10 & 12\\
13 & 12 & 13\\
13 & 11 & 15
\end{vmatrix}$$

We can now simplify this by further row or column manipulations, or, since it is of the third
order, we can expand it by the diagonal method. The result is -166.
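
The strategy of Example 3 (use Theorem 10 to create zeros, then expand) becomes a general procedure if it is applied systematically to rows until the array is triangular. The Python sketch below is one such procedure, offered as an illustration and not as the author's method: the name `det_by_elimination` is hypothetical, exact rational arithmetic from the standard `fractions` module is assumed so that no rounding occurs, and row interchanges are accounted for by Theorem 7.

```python
from fractions import Fraction


def det_by_elimination(rows):
    """Reduce to upper triangular form, then multiply the diagonal elements."""
    a = [[Fraction(x) for x in row] for row in rows]
    n = len(a)
    sign = 1
    product = Fraction(1)
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero element in column c.
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)  # the determinant vanishes
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign  # Theorem 7: an interchange changes the sign
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            # Theorem 10: subtracting a multiple of another row leaves the value unchanged.
            a[r] = [x - factor * y for x, y in zip(a[r], a[c])]
        product *= a[c][c]
    return sign * product


M = [[3, 1, -1, 2, 1],
     [0, 3, 1, 4, 2],
     [1, 4, 2, 3, 1],
     [5, -1, -3, 2, 5],
     [-1, 1, 2, 3, 2]]

print(det_by_elimination(M))
```

For an array of order n this requires on the order of n^3 arithmetic operations, whereas a full expansion by Definition 1 produces n! terms.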

THEOREM 11 

The sum of the products formed by multiplying the elements of one row (or 
column) of a determinant by the cofactors of the corresponding elements of 
another row (or column) is zero. 




PROOF Let |A| = $|a_{ij}|$ be the given determinant, and let the elements of
some row of |A|, say the ith, be multiplied by the cofactors of the corresponding
elements in some other row, say the jth, giving the sum

$$\sum_{k=1}^{n} a_{ik} A_{jk}$$

Clearly, according to Definition 1, this can be thought of as the expansion of a
determinant whose jth row consists of the elements

$$a_{i1} \quad a_{i2} \quad \cdots \quad a_{in}$$

and whose other rows are identical with the corresponding rows in |A|. In this
new determinant the ith and jth rows are therefore the same, and, hence, by
Theorem 8, the determinant is equal to zero. A similar argument leads to the
same conclusion if the elements in some column of |A| are multiplied by the
cofactors of the corresponding elements in some other column of |A|. Thus the
theorem is established.

Combining Definition 1 and Theorem 11, we have the follow- 
ing useful result: 


COROLLARY 1 

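If $A_{ij}$ is the cofactor of the element $a_{ij}$ in the determinant |A| = $|a_{ij}|$, then

$$\sum_{k=1}^{n} a_{ik} A_{jk} = \sum_{k=1}^{n} a_{ki} A_{kj} =
\begin{cases} |A| & i = j\\ 0 & i \neq j \end{cases}$$


EXAMPLE 4

If we take the elements of the first row of the determinant

$$\begin{vmatrix}
a_{11} & a_{12} & a_{13}\\
a_{21} & a_{22} & a_{23}\\
a_{31} & a_{32} & a_{33}
\end{vmatrix}$$

and multiply them by the cofactors of the corresponding elements in the third row, say, we
obtain the sum

$$a_{11}\begin{vmatrix} a_{12} & a_{13}\\ a_{22} & a_{23} \end{vmatrix}
- a_{12}\begin{vmatrix} a_{11} & a_{13}\\ a_{21} & a_{23} \end{vmatrix}
+ a_{13}\begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{vmatrix}$$

and this is clearly the expansion of the determinant

$$\begin{vmatrix}
a_{11} & a_{12} & a_{13}\\
a_{21} & a_{22} & a_{23}\\
a_{11} & a_{12} & a_{13}
\end{vmatrix}$$

according to the third row. Since this determinant has two identical rows, it vanishes identically.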
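
The content of Definition 1 and Theorem 11 taken together (elements times their own cofactors give |A|, elements times the cofactors of a different row or column give zero) can be verified for a particular array. The sketch below is an illustration only: the names `det`, `cofactor`, `row_sum`, and `column_sum` and the sample array are hypothetical, and indices run from 0.

```python
def det(a):
    """First-row cofactor expansion; the empty determinant is taken as 1."""
    if not a:
        return 1
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))


def cofactor(a, i, j):
    """Signed minor obtained by deleting row i and column j."""
    minor = [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]
    return (-1) ** (i + j) * det(minor)


def row_sum(a, i, j):
    """Elements of row i multiplied by the cofactors of row j."""
    return sum(a[i][k] * cofactor(a, j, k) for k in range(len(a)))


def column_sum(a, i, j):
    """Elements of column i multiplied by the cofactors of column j."""
    return sum(a[k][i] * cofactor(a, k, j) for k in range(len(a)))


A = [[2, -1, 3],
     [0, 4, 1],
     [5, 2, -2]]

for i in range(3):
    print([row_sum(A, i, j) for j in range(3)])
```

The printed table has |A| on its principal diagonal and zeros elsewhere.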

Results such as those of Corollary 1 are often stated more
compactly in terms of what is known as the Kronecker delta,*
usually written $\delta_{ij}$, or sometimes $\delta_j^i$, and defined to be 0 or 1
according as i ≠ j or i = j. Using the Kronecker delta, the
assertions of Corollary 1 can be written in the simpler form

(11) $\sum_{k=1}^{n} a_{ik} A_{jk} = |A|\,\delta_{ij}$

(12) $\sum_{k=1}^{n} a_{ki} A_{kj} = |A|\,\delta_{ij}$

* Named for the German mathematician Leopold Kronecker (1823-1891).


THEOREM 12

The product of two determinants of the same order is a determinant of the same
order in which the element in the ith row and jth column is the sum of the prod-
ucts of corresponding elements in the ith row of the first determinant and the jth
column of the second determinant.

PROOF For simplicity we shall prove this theorem only for determinants of
the second order, although for these direct verification is easier and more natural
than the method we shall actually use. The virtue of our proof is that it can be
extended immediately to the general case of determinants of any order. We begin
by observing that if

$$|A| = \begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{vmatrix}
\qquad \text{and} \qquad
|B| = \begin{vmatrix} b_{11} & b_{12}\\ b_{21} & b_{22} \end{vmatrix}$$

then, by Theorem 3,

$$|A| \cdot |B| = \begin{vmatrix}
a_{11} & a_{12} & 0 & 0\\
a_{21} & a_{22} & 0 & 0\\
c_{11} & c_{12} & b_{11} & b_{12}\\
c_{21} & c_{22} & b_{21} & b_{22}
\end{vmatrix}$$

where $c_{11}$, $c_{12}$, $c_{21}$, and $c_{22}$ are completely arbitrary. In particular, it is convenient
to take $c_{11} = c_{22} = -1$ and $c_{12} = c_{21} = 0$, so that we have

$$|A| \cdot |B| = \begin{vmatrix}
a_{11} & a_{12} & 0 & 0\\
a_{21} & a_{22} & 0 & 0\\
-1 & 0 & b_{11} & b_{12}\\
0 & -1 & b_{21} & b_{22}
\end{vmatrix}$$

Now if we multiply the elements of the first column by $b_{11}$ and the elements of
the second column by $b_{21}$ and then add them to the corresponding elements of
the third column, we obtain, by Theorem 10, the equal determinant

$$|A| \cdot |B| = \begin{vmatrix}
a_{11} & a_{12} & a_{11}b_{11} + a_{12}b_{21} & 0\\
a_{21} & a_{22} & a_{21}b_{11} + a_{22}b_{21} & 0\\
-1 & 0 & 0 & b_{12}\\
0 & -1 & 0 & b_{22}
\end{vmatrix}$$

In the same way, if we multiply the elements of the first column by $b_{12}$ and the
elements of the second column by $b_{22}$ and then add them to the corresponding
elements of the fourth column, we obtain from the last determinant the equal
determinant

$$|A| \cdot |B| = \begin{vmatrix}
a_{11} & a_{12} & a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22}\\
a_{21} & a_{22} & a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22}\\
-1 & 0 & 0 & 0\\
0 & -1 & 0 & 0
\end{vmatrix}$$

If we now expand this determinant by Theorem 3, applied to the last two rows,
we obtain

$$|A| \cdot |B| = \begin{vmatrix}
a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22}\\
a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22}
\end{vmatrix}$$

which is the result asserted by the theorem.
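
The proof of Theorem 12 rests on a bordered determinant of twice the order, and both that device and the theorem itself can be checked numerically. The following sketch is an illustration under stated assumptions: the names `det`, `row_by_column_product`, and `bordered` are hypothetical, and the sample arrays are arbitrary.

```python
def det(a):
    """First-row cofactor expansion; the empty determinant is taken as 1."""
    if not a:
        return 1
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))


def row_by_column_product(a, b):
    """Array whose (i, j) element is the ith row of a times the jth column of b."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bordered(a, b):
    """Array of order 2n with a and zeros on top, -1's on a diagonal and b below."""
    n = len(a)
    top = [a[i] + [0] * n for i in range(n)]
    bottom = [[-1 if j == i else 0 for j in range(n)] + b[i] for i in range(n)]
    return top + bottom


A = [[1, 2], [3, 4]]
B = [[0, 5], [-1, 2]]

print(det(A) * det(B), det(bordered(A, B)), det(row_by_column_product(A, B)))
```

All three printed values agree, which is the chain of equalities used in the proof.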


EXERCISES 

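1 Find the value of each of the following determinants:

a $\begin{vmatrix} 1 & 2 & 3 & 4\\ 2 & 1 & 4 & 3\\ 3 & 4 & 2 & 1\\ 4 & 3 & 1 & 2 \end{vmatrix}$

b $\begin{vmatrix} 1 & 2 & 3 & 4\\ 4 & 3 & 2 & 1\\ 2 & 1 & 4 & 3\\ 3 & 4 & 1 & 2 \end{vmatrix}$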


2 a Find the value(s) of a, if any, for which the diagonal method of expansion yields the
correct value for the determinant



b Show that there is no value of a for which the diagonal method of expansion yields the
correct value for the determinant

$$\begin{vmatrix} 1 & 2 & 1 & 1\\ -1 & 1 & 3 & 2\\ -1 & 3 & 9 & 1\\ 2 & 1 & 1 & a \end{vmatrix}$$


3 Show that the number of terms in the expansion of a general determinant of order n is n! 

4 If |A| = $|a_{ij}|$ is a determinant of order n with the property that $a_{ij} = -a_{ji}$ for all values of i
and j such that 1 ≤ i,j ≤ n, prove that $|A| = (-1)^n |A|$. What further conclusion can be
drawn if n is odd? (Hint: Use Theorems 2 and 6.)

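5 Prove that

$$\begin{vmatrix}
1 + a_1 & a_2 & a_3 & \cdots & a_n\\
a_1 & 1 + a_2 & a_3 & \cdots & a_n\\
a_1 & a_2 & 1 + a_3 & \cdots & a_n\\
\vdots & \vdots & \vdots & & \vdots\\
a_1 & a_2 & a_3 & \cdots & 1 + a_n
\end{vmatrix} = 1 + a_1 + a_2 + \cdots + a_n$$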

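6 If $D_n$ is the nth-order determinant

$$D_n = \begin{vmatrix}
1 + x^2 & x & 0 & \cdots & 0 & 0\\
x & 1 + x^2 & x & \cdots & 0 & 0\\
0 & x & 1 + x^2 & \cdots & 0 & 0\\
\vdots & \vdots & \vdots & & \vdots & \vdots\\
0 & 0 & 0 & \cdots & 1 + x^2 & x\\
0 & 0 & 0 & \cdots & x & 1 + x^2
\end{vmatrix}$$

show that $D_n = (1 + x^2)D_{n-1} - x^2 D_{n-2}$. Using this relation, determine the value of $D_{10}$ if
x = 1; if x = -1. Is the value of $D_n$ independent of x?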

7 If $D_n$ is the nth-order determinant in which each element on the principal diagonal is a,
each element immediately above the principal diagonal is b, each element immediately
below the principal diagonal is c, and all other elements are zero, obtain a recurrence rela-
tion expressing $D_n$ in terms of $D_{n-1}$ and $D_{n-2}$. Use this relation to infer the value of $D_n$ if
a = 3, b = 2, c = 1.

8 Show that the nth-order determinant

$$\begin{vmatrix}
a & b & b & \cdots & b\\
b & a & b & \cdots & b\\
b & b & a & \cdots & b\\
\vdots & \vdots & \vdots & & \vdots\\
b & b & b & \cdots & a
\end{vmatrix}$$

is equal to $(a - b)^{n-1}[a + (n - 1)b]$.

9 Show that

$$\begin{vmatrix}
1 & 1 & 1 & 1\\
a_1 & a_2 & a_3 & a_4\\
a_1^2 & a_2^2 & a_3^2 & a_4^2\\
a_1^3 & a_2^3 & a_3^3 & a_4^3
\end{vmatrix}
= (a_1 - a_2)(a_1 - a_3)(a_1 - a_4)(a_2 - a_3)(a_2 - a_4)(a_3 - a_4)$$

What is the generalization of this result to determinants of order n?

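10 If $p_1, p_2, \ldots, p_n$ are polynomials and if $x_1, x_2, \ldots, x_n$ are variables, show that the nth-
order determinant

$$\begin{vmatrix}
p_1(x_1) & p_1(x_2) & \cdots & p_1(x_n)\\
p_2(x_1) & p_2(x_2) & \cdots & p_2(x_n)\\
\vdots & \vdots & & \vdots\\
p_n(x_1) & p_n(x_2) & \cdots & p_n(x_n)
\end{vmatrix}$$

is divisible by

$$\prod_{1 \le i < j \le n} (x_i - x_j)$$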

11 Prove the following generalization of Theorem 12: The product of two determinants of the 
same order is another determinant of the same order in which the element in the ith row and 
jth column is the sum of the products of corresponding elements in the ith row or column of 
the first determinant and the jth row or column of the second determinant, a consistent 
choice of row or column being maintained for all values of i and j. (Hint: Use Theorem 2.) 


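12 a Show that

$$|A| = \begin{vmatrix}
b^2 + ac & bc & c^2\\
ab & 2ac & bc\\
a^2 & ab & b^2 + ac
\end{vmatrix} = 4a^2b^2c^2$$

(Hint: Verify first that $|A| = \begin{vmatrix} b & c & 0\\ a & 0 & c\\ 0 & a & b \end{vmatrix}^2$.)

b Find the value of the determinant

$$\begin{vmatrix}
b^2 + c^2 & ab & ac\\
ab & c^2 + a^2 & bc\\
ac & bc & a^2 + b^2
\end{vmatrix}$$

13 If |A| is the nth-order determinant

$$|A| = \begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix}$$

show that

$$\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & x_1\\
a_{21} & a_{22} & \cdots & a_{2n} & x_2\\
\vdots & \vdots & & \vdots & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn} & x_n\\
y_1 & y_2 & \cdots & y_n & 0
\end{vmatrix}
= -\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}\, x_i\, y_j$$

where $A_{ij}$ is the cofactor of $a_{ij}$ in |A|.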




14 Show that the area of the triangle whose vertices are the points $(x_1,y_1)$, $(x_2,y_2)$, $(x_3,y_3)$ is
given by the formula

$$A = \pm\frac{1}{2}\begin{vmatrix} x_1 & y_1 & 1\\ x_2 & y_2 & 1\\ x_3 & y_3 & 1 \end{vmatrix}$$

where the plus or the minus sign is chosen according as the vertices of the triangle are
numbered consecutively in the counterclockwise or the clockwise direction.

15 If $P_1$: $(x_1,y_1)$, $P_2$: $(x_2,y_2)$, and $P_3$: $(x_3,y_3)$ are three points, no two of which lie on the same
vertical line, show that the equation of the parabola of the family $y = a + bx + cx^2$ which
passes through $P_1$, $P_2$, and $P_3$ can be written in the form

$$\begin{vmatrix}
y & 1 & x & x^2\\
y_1 & 1 & x_1 & x_1^2\\
y_2 & 1 & x_2 & x_2^2\\
y_3 & 1 & x_3 & x_3^2
\end{vmatrix} = 0$$

Is this result correct if $P_1$, $P_2$, $P_3$ are collinear?

16 Show that the equation of the circle which passes through the three points $P_1$: $(x_1,y_1)$,
$P_2$: $(x_2,y_2)$, and $P_3$: $(x_3,y_3)$ can be written in the form

$$\begin{vmatrix}
x^2 + y^2 & x & y & 1\\
x_1^2 + y_1^2 & x_1 & y_1 & 1\\
x_2^2 + y_2^2 & x_2 & y_2 & 1\\
x_3^2 + y_3^2 & x_3 & y_3 & 1
\end{vmatrix} = 0$$

Is this result correct if the three points are collinear?

17 a If $l_1$, $l_2$, and $l_3$ are three lines with the equations

$$\begin{aligned}
a_{11}x + a_{12}y + a_{13} &= 0\\
a_{21}x + a_{22}y + a_{23} &= 0\\
a_{31}x + a_{32}y + a_{33} &= 0
\end{aligned}$$

no two of which are parallel, show that $l_1$, $l_2$, and $l_3$ are concurrent if and only if

$$|A| = \begin{vmatrix}
a_{11} & a_{12} & a_{13}\\
a_{21} & a_{22} & a_{23}\\
a_{31} & a_{32} & a_{33}
\end{vmatrix} = 0$$

b Show that the area of the triangle determined by the lines $l_1$, $l_2$, and $l_3$ is equal to the
absolute value of the expression

$$\frac{|A|^2}{2A_{13}A_{23}A_{33}}$$

where $A_{ij}$ is the cofactor of $a_{ij}$ in |A|.

18 If |A| = $|a_{ij}|$, show that $\dfrac{\partial |A|}{\partial a_{ij}} = A_{ij}$.


19 If each element of a determinant \A\ of order n is a differentiable function of t, show that 
the derivative of |A| with respect to t is equal to the sum of n determinants, the ith one of 
which is identical with |A| except for the ith row which consists of the derivatives of the
elements of the ith row of |A|. (Hint: Proceed inductively, as in the proof of Theorem 1.) 

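20 If $f_1, f_2, \ldots, f_n$ are suitably differentiable functions of t, show that

$$\frac{d}{dt}\begin{vmatrix}
f_1 & \cdots & f_n\\
f_1' & \cdots & f_n'\\
\vdots & & \vdots\\
f_1^{(n-2)} & \cdots & f_n^{(n-2)}\\
f_1^{(n-1)} & \cdots & f_n^{(n-1)}
\end{vmatrix}
=
\begin{vmatrix}
f_1 & \cdots & f_n\\
f_1' & \cdots & f_n'\\
\vdots & & \vdots\\
f_1^{(n-2)} & \cdots & f_n^{(n-2)}\\
f_1^{(n)} & \cdots & f_n^{(n)}
\end{vmatrix}$$

(Hint: Use the result of Exercise 19.)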




10.2 

Elementary properties of matrices

Closely associated with determinants, yet significantly different
and much more fundamental, are the mathematical objects
known as matrices:

DEFINITION 1 

An m × n or (m,n) matrix is a rectangular array of quantities arranged in m
rows and n columns.

When there is no possibility of confusion, matrices are often
represented by single capital letters. More commonly, however,
they are represented by displaying some or all of the constituent
quantities between double vertical bars;* thus,

$$\begin{Vmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & & \vdots\\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{Vmatrix}$$

Two matrices $A = \|a_{ij}\|$ and $B = \|b_{ij}\|$ are equal if and only if
they are identical; that is, if and only if they contain the same
number of rows and the same number of columns, and $a_{ij} = b_{ij}$
for all values of i and j.

A matrix consisting of a single column is called a column 
matrix. A matrix consisting of a single row is called a row matrix. 
Both column matrices and row matrices are often referred to as 
vectors. The (n,m) matrix obtained from a given (m,n) matrix
A by interchanging its rows and columns is called the transpose
of A. We shall denote the transpose of a matrix A by the symbol
$A^T$, although some writers use the symbol $A'$ or the symbol $\tilde{A}$.
A matrix with the same number of rows and columns is called a 
square matrix. The determinant whose array of elements is 
identical with the array of elements of a square matrix is called the 
determinant of the matrix. A square matrix in which every 
element below the principal diagonal is zero is said to be upper 
triangular. A square matrix in which every element above the 
principal diagonal is zero is said to be lower triangular. A square 
matrix in which each element not on the principal diagonal is 
zero is called a diagonal matrix. Diagonal matrices are sometimes 
denoted by the symbol 



* Some writers use square brackets or parentheses instead of double bars. 




A diagonal matrix in which each diagonal element is 1 is called a 
unit matrix. A unit matrix is usually denoted by the symbol I, 
or, more specifically, by the symbol $I_n$ if it is a unit matrix of
order n. A matrix in which every element is zero is called a 
null matrix or a zero matrix, and is denoted by the symbol 0. 

If A is an (m,n) matrix and if k and l are integers such that
0 < k ≤ m and 0 < l ≤ n, then the array of elements common
to any k rows and any l columns of A is called a (k,l) submatrix
of A. If A is a square matrix, any square submatrix of A whose
principal diagonal is a part of the principal diagonal of A is called
a principal submatrix of A. A principal submatrix of a square
matrix A is thus a submatrix of rows and columns with the same
indices. The determinants of the square submatrices of any matrix
A are called the minors of A. The determinant of any principal
submatrix of a square matrix A is called a principal minor of A.
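
The definitions of submatrix, principal submatrix, and principal minor translate directly into index selections. The sketch below is an illustration only: the names `submatrix`, `det`, and `principal_minors` and the sample matrix are hypothetical, and rows and columns are numbered from 0.

```python
from itertools import combinations


def det(a):
    """First-row cofactor expansion; the empty determinant is taken as 1."""
    if not a:
        return 1
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))


def submatrix(a, rows, cols):
    """The (k,l) submatrix common to the chosen k rows and l columns."""
    return [[a[i][j] for j in cols] for i in rows]


def principal_minors(a, k):
    """Determinants of all principal submatrices of order k, keyed by their indices."""
    n = len(a)
    return {idx: det(submatrix(a, idx, idx)) for idx in combinations(range(n), k)}


A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]

print(submatrix(A, (0, 2), (1, 2)))   # a (2,2) submatrix that is not principal
print(principal_minors(A, 2))
```

A principal submatrix uses the same index set for rows and columns, which is why `principal_minors` passes `idx` twice.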

Most matrices which occur in elementary applications have 
the property that all their elements are real. However, there are 
important applications, especially in mathematical physics and 
quantum mechanics, which involve matrices whose elements are 
not real. For this reason we shall introduce certain definitions 
and later on state certain fundamental theorems in a form appro- 
priate to matrices whose elements may be general complex 
quantities. 

Recalling that the conjugate of a complex number $z = x + iy$
is the complex number $\bar{z} = x - iy$, we say that the conjugate
of a matrix A is the matrix $\bar{A}$ whose elements are, respectively,
the conjugates of the elements of A. Clearly, a matrix is real,
i.e., contains only real elements, if and only if A and $\bar{A}$ are the
same matrix. Similarly, a matrix A is imaginary, i.e., contains
only elements which are pure imaginaries, if and only if $A = -\bar{A}$.
The transpose of the conjugate of a matrix A is called the
associate of A.

A matrix equal to its transpose, i.e., a square matrix such
that $a_{ij} = a_{ji}$ for 1 ≤ i,j ≤ n, is said to be symmetric. A matrix
equal to the negative of its transpose, i.e., a square matrix such
that $a_{ij} = -a_{ji}$ and in which, therefore, $a_{ii} = 0$, is said to be
skew-symmetric. A matrix equal to its associate, i.e., a matrix A
such that $A = \bar{A}^T$, is said to be hermitian.* A matrix A such that
$A = -\bar{A}^T$ is said to be skew-hermitian. Clearly, a real symmetric
matrix is just a hermitian matrix which is real, and a real skew-
symmetric matrix is just a skew-hermitian matrix which is real.
Thus, in particular, any result true for hermitian matrices is
automatically true for real symmetric matrices. For this reason,
although real symmetric matrices are of fundamental importance
in most of the applications of matrices in this book, we shall as
far as possible state our theorems in terms of hermitian matrices.


* Named for the French mathematician Charles Hermite (1822-1905). 
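
The four classes just defined can be tested directly from the definitions of transpose, conjugate, and associate. The following sketch is an illustration only: all function names and sample matrices are hypothetical, and Python's built-in complex numbers (written with `j` for the imaginary unit) are assumed as the element type.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def conjugate(a):
    return [[complex(x).conjugate() for x in row] for row in a]


def associate(a):
    """Transpose of the conjugate."""
    return transpose(conjugate(a))


def negative(a):
    return [[-x for x in row] for row in a]


def is_symmetric(a):
    return a == transpose(a)


def is_skew_symmetric(a):
    return a == negative(transpose(a))


def is_hermitian(a):
    return a == associate(a)


def is_skew_hermitian(a):
    return a == negative(associate(a))


H = [[2, 1 + 1j], [1 - 1j, 3]]       # equal to its associate
K = [[1j, 2 + 1j], [-2 + 1j, 0]]     # equal to the negative of its associate
S = [[1, 2], [2, 5]]                 # real symmetric
R = [[0, 3], [-3, 0]]                # real skew-symmetric

print(is_hermitian(H), is_skew_hermitian(K), is_hermitian(S), is_skew_hermitian(R))
```

The last two checks illustrate the remark that a real symmetric matrix is hermitian and a real skew-symmetric matrix is skew-hermitian.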




The concept of a matrix is essentially simpler than the con- 
cept of a determinant. For, whereas a matrix is just a collection 
of elements arranged in a particular way, a determinant is a 
rather complicated function of the elements in a given set. In 
particular, with every square matrix there is associated a deter- 
minant, namely, the determinant whose elements are respectively 
equal to the elements of the matrix. Thus determinants of order 
n and n × n matrices bear to each other the familiar relation
of dependent and independent variable, respectively; and it is
appropriate to speak of a determinant as a function of a square 
matrix. 

Examples of matrices can be found in many fields. For 
instance, if each of m students were given a battery of n different 
tests, the resulting scores would very probably be displayed in a 
table containing m rows — one for each student— and n columns — 
one for each test. The resulting array would, of course, be an (m,n) 
matrix in which the general element $a_{ij}$ was the score which the
ith student made on the jth test. Matrices of this sort are of
fundamental importance in the branch of mathematical psy-
chology known as factor analysis. Similarly, if we had an electrical
network containing n branches, we might, either experimentally
or analytically, determine the current which would flow in the
ith branch as a result of inserting a unit voltage in the jth branch.
A tabular array of these quantities would also constitute a
matrix, the so-called admittance matrix, which is of fundamental
importance in the theory of electrical circuits. Still another
example of a matrix is provided by an array of transition
probabilities: Suppose that a system S can exist in any one of n
states, say $S_1, S_2, \ldots, S_n$, and that the probability of the
system passing from the state $S_i$ to the state $S_j$ by some well-
defined random process is $p_{ij}$. Clearly, the $p_{ij}$'s can be displayed
as an (n,n) matrix. Such matrices are of great importance in
the theory of probability and its physical applications. 

Since, as we pointed out above, both row and column 
matrices are called vectors, we have, in effect, accepted the 
following definition: 

DEFINITION 2 

An n-dimensional vector is an ordered set of n quantities. 

This use of the word vector requires explanation, since, at first 
glance, it appears to be somewhat at variance with the familiar 
usage of elementary physics. In physics, a vector quantity is a 
quantity, such as a velocity or a force, which possesses both mag- 
nitude and direction and, hence, can be represented by a directed 
line segment. However, it is clear that such a quantity is uniquely 
determined by its components in the directions of the three 
coordinate axes, and conversely. Hence, it can be uniquely 
associated with an ordered set of three quantities, i.e., with either 




a row matrix or a column matrix containing three elements. A 
vector quantity in the physical sense is thus an example of a 
vector in the matric sense. However, the matric sense includes 
vectors other than physical vectors. In particular, any set of 
values of $x_1, x_2, \ldots, x_n$ which satisfies a system of n linear
equations in n variables can be thought of as a vector, and in 
our work we shall often speak of the solution vectors of such 
systems. 

By defining appropriately the addition and multiplication 
of matrices, an algebra of matrices can be developed. As we 
shall soon see, this is quite different from ordinary algebra, and 
for this reason, it is convenient to have a term to denote col- 
lectively those quantities, such as the variables and constants of 
our work up to this point, which obey the familiar laws of 
elementary algebra. These we shall henceforth refer to as scalars. 
For matrices, in contrast to scalars, then, we have the following 
definitions and rules of operation: 

The sum or difference of two matrices A and B having the 
same number of rows and the same number of columns is the 
matrix A ± B whose elements are the sums or differences of the 
respective elements of A and B. Obviously, if addition is com- 
mutative for the elements of A and B, it is also commutative 
for A and B themselves, and we have A + B = B + A. Simi- 
larly, if addition is associative for the elements of the matrices 
A, B, and C, it is associative for A, B, and C, and we have
A + (B + C) = (A + B) + C. Addition and subtraction are
not defined for matrices which do not have the same number of 
rows and the same number of columns. 

The product of a matrix A and a scalar k is the matrix
kA = Ak whose elements are the elements of A each multiplied
by k.
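
The rules for sums and scalar multiples can be written out in a few lines. The sketch below is an illustration only: the names `shape`, `add`, and `scale` and the sample matrices are hypothetical, and matrices are assumed to be stored as lists of rows.

```python
def shape(a):
    """(number of rows, number of columns)."""
    return (len(a), len(a[0]))


def add(a, b):
    """Sum of two matrices; defined only when both have the same shape."""
    if shape(a) != shape(b):
        raise ValueError("addition is not defined for matrices of different shapes")
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def scale(k, a):
    """Product of the scalar k and the matrix a."""
    return [[k * x for x in row] for row in a]


A = [[1, 2, 3], [4, 5, 6]]
B = [[0, -1, 2], [3, 0, -2]]
C = [[1, 1, 1], [2, 2, 2]]

print(add(A, B))
print(scale(2, A))
```

Subtraction needs no separate routine in this sketch, since A - B can be formed as `add(A, scale(-1, B))`.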

The scalar product of two vectors having the same number 
of elements, or components, is the sum of the products of cor- 
responding components of the two vectors. The scalar product of 
two vectors X and Y is also referred to as the inner product or
dot product of X and Y and is often denoted by the symbol X·Y.
Obviously, X·Y = Y·X.

The coordinates of a point in three dimensions, say P:
$(x_1,x_2,x_3)$, form a vector $X = \|x_1 \; x_2 \; x_3\|$ in the matric sense, which
is completely equivalent to the directed line segment from the
origin O to the point P, thought of as a vector in the physical
or geometric sense. Now from analytic geometry we know that
the square of the length of the segment OP is given by the formula

$$(OP)^2 = x_1^2 + x_2^2 + x_3^2$$

But this is simply the scalar product X·X of the vector X with
itself. Hence, by analogy, the length or absolute value of any
vector

$$X = \|x_1 \; x_2 \; \cdots \; x_n\|$$

is defined to be the square root of the scalar product

$$X \cdot X = \sum_{i=1}^{n} x_i^2$$

A vector X with the property that

$$X \cdot X = \sum_{i=1}^{n} x_i^2 = 1$$

is called a unit vector.

From analytic geometry we also know that with every line l,
there is associated a set of ordered triples

$$(kl_1, kl_2, kl_3) \qquad k \neq 0; \; l_1, l_2, l_3 \text{ not all zero}$$

known as the direction numbers of the line. Moreover, if $(l_1,l_2,l_3)$
and $(m_1,m_2,m_3)$ are, respectively, direction numbers of the lines
l and m, then l and m are parallel if and only if the sets $(l_1,l_2,l_3)$
and $(m_1,m_2,m_3)$ are proportional. Since the sets $(l_1,l_2,l_3)$ and
$(m_1,m_2,m_3)$ can obviously be thought of as vectors

$$L = \|l_1 \; l_2 \; l_3\| \qquad \text{and} \qquad M = \|m_1 \; m_2 \; m_3\|$$

it is natural to extend these ideas to vectors in general by saying,
as a matter of definition, that two nonzero vectors

$$X = \|x_1 \; x_2 \; \cdots \; x_n\| \qquad \text{and} \qquad Y = \|y_1 \; y_2 \; \cdots \; y_n\|$$

have the same direction if and only if their components are
proportional.

It is also a well-known fact of analytic geometry that, if
$(l_1,l_2,l_3)$ and $(m_1,m_2,m_3)$ are, respectively, direction numbers of
the lines l and m, then l and m are perpendicular if and only if

$$l_1m_1 + l_2m_2 + l_3m_3 = 0$$

But this is simply the condition that the scalar product L·M of
the two vectors L and M be equal to zero. Hence, by analogy
with this result, we agree that two nonzero vectors

$$X = \|x_1 \; x_2 \; \cdots \; x_n\| \qquad \text{and} \qquad Y = \|y_1 \; y_2 \; \cdots \; y_n\|$$

will be called perpendicular, or orthogonal, if and only if they
satisfy the condition

$$X \cdot Y = \sum_{i=1}^{n} x_i y_i = 0$$

This extended concept of orthogonality is of fundamental impor-
tance in many applications of matrices.
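
The definitions of scalar product, length, direction, and orthogonality for vectors of n components can be collected in a short sketch. It is an illustration only: the function names and sample vectors are hypothetical, vectors are assumed to be plain lists of real numbers, and proportionality of components is tested by requiring every second-order determinant formed from corresponding pairs of components to vanish.

```python
import math


def dot(x, y):
    """Scalar product: sum of the products of corresponding components."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(x, y))


def length(x):
    """Length or absolute value: square root of the scalar product of x with itself."""
    return math.sqrt(dot(x, x))


def same_direction(x, y):
    """True if the components of the nonzero vectors x and y are proportional."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    n = len(x)
    return all(x[i] * y[j] == x[j] * y[i] for i in range(n) for j in range(i + 1, n))


def orthogonal(x, y):
    """True if the scalar product of x and y is zero."""
    return dot(x, y) == 0


X = [1, 2, -1]
Y = [2, -1, 0]

print(dot(X, Y), length([3, 4]), same_direction([1, 2, 3], [2, 4, 6]), orthogonal(X, Y))
```

With floating-point components the exact comparisons with zero would normally be replaced by a tolerance; integers are used here so that the tests are exact.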

Two matrices A and B are said to be conformable in the 
order AB if and only if the number of columns in A is equal to 
the number of rows in B. In other words, if A is an (m,n) matrix 


418 


DETERMINANTS AND MATRICES 


CHAP. 10 


a row matrix or a column matrix containing three elements. A 
vector quantity in the physical sense is thus an example of a 
vector in the matrie sense. However, the matric sense includes 
vectors other than physical vectors. In particular, any set of 
values of xi, % 2 , • ■ ■ , x n which satisfies a system of n linear 
equations in w variables can be thought of as a vector, and in 
our work we shall often speak of the solution vectors of such 
systems. 

By defining appropriately the addition and multiplication 
of matrices, an algebra of matrices can be developed. As we 
shall soon see, this is quite different from ordinary algebra, and 
for this reason, it is convenient to have a term to denote col- 
lectively those quantities, such as the variables and constants of 
our work up to this point, which obey the familiar laws of 
elementary algebra. These we shall henceforth refer to as scalars. 
For matrices, in contrast to scalars, then, we have the following 
definitions and rules of operation : 

The sum or difference of two matrices A and B having the
same number of rows and the same number of columns is the
matrix $A \pm B$ whose elements are the sums or differences of the
respective elements of A and B. Obviously, if addition is
commutative for the elements of A and B, it is also commutative
for A and B themselves, and we have $A + B = B + A$. Similarly,
if addition is associative for the elements of the matrices
A, B, and C, it is associative for A, B, and C, and we have
$A + (B + C) = (A + B) + C$. Addition and subtraction are
not defined for matrices which do not have the same number of
rows and the same number of columns.

The product of a matrix A and a scalar k is the matrix 
kA = Ak whose elements are the elements of A each multiplied 
by k. 

The scalar product of two vectors having the same number 
of elements, or components, is the sum of the products of cor- 
responding components of the two vectors. The scalar product of 
two vectors X and Y is also referred to as the inner product or 
dot product of X and Y and is often denoted by the symbol X • Y. 
Obviously, $X \cdot Y = Y \cdot X$.

The coordinates of a point in three dimensions, say
$P\colon (x_1, x_2, x_3)$, form a vector $X = \|x_1 \; x_2 \; x_3\|$ in the
matric sense, which is completely equivalent to the directed line
segment from the origin O to the point P, thought of as a vector
in the physical or geometric sense. Now from analytic geometry we
know that the square of the length of the segment OP is given by
the formula

$$(OP)^2 = x_1^2 + x_2^2 + x_3^2$$

But this is simply the scalar product $X \cdot X$ of the vector X with
itself. Hence, by analogy, the length or absolute value of any
vector

$$X = \|x_1 \; x_2 \; \cdots \; x_n\|$$

is defined to be the square root of the scalar product

$$X \cdot X = \sum_{i=1}^{n} x_i^2$$

A vector X with the property that

$$X \cdot X = \sum_{i=1}^{n} x_i^2 = 1$$

is called a unit vector.
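The definitions of scalar product, length, and unit vector can be sketched in a few lines of Python. This is only an illustration; the function names `dot`, `length`, and `normalize` are hypothetical and do not come from the text.

```python
import math

def dot(x, y):
    """Scalar (inner) product: sum of products of corresponding components."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(x, y))

def length(x):
    """Length of x: the square root of the scalar product of x with itself."""
    return math.sqrt(dot(x, x))

def normalize(x):
    """Unit vector obtained by dividing a nonzero vector by its length."""
    n = length(x)
    if n == 0:
        raise ValueError("the zero vector cannot be normalized")
    return [a / n for a in x]

x = [3.0, 4.0, 12.0]
print(dot(x, x))                      # 169.0
print(length(x))                      # 13.0
u = normalize(x)
print(math.isclose(dot(u, u), 1.0))   # True
```

Floating-point arithmetic is used here, so the unit-vector property is checked with a tolerance rather than exact equality.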

From analytic geometry we also know that with every line l,
there is associated a set of ordered triples

$$(kl_1, kl_2, kl_3) \qquad k \neq 0; \; l_1, l_2, l_3 \text{ not all zero}$$

known as the direction numbers of the line. Moreover, if $(l_1, l_2, l_3)$
and $(m_1, m_2, m_3)$ are, respectively, direction numbers of the lines
l and m, then l and m are parallel if and only if the sets $(l_1, l_2, l_3)$
and $(m_1, m_2, m_3)$ are proportional. Since the sets $(l_1, l_2, l_3)$ and
$(m_1, m_2, m_3)$ can obviously be thought of as vectors

$$L = \|l_1 \; l_2 \; l_3\| \quad \text{and} \quad M = \|m_1 \; m_2 \; m_3\|$$

it is natural to extend these ideas to vectors in general by saying,
as a matter of definition, that two nonzero vectors

$$X = \|x_1 \; x_2 \; \cdots \; x_n\| \quad \text{and} \quad Y = \|y_1 \; y_2 \; \cdots \; y_n\|$$

have the same direction if and only if their components are
proportional.

It is also a well-known fact of analytic geometry that, if
$(l_1, l_2, l_3)$ and $(m_1, m_2, m_3)$ are, respectively, direction numbers of
the lines l and m, then l and m are perpendicular if and only if

$$l_1 m_1 + l_2 m_2 + l_3 m_3 = 0$$

But this is simply the condition that the scalar product $L \cdot M$ of
the two vectors L and M be equal to zero. Hence, by analogy
with this result, we agree that two nonzero vectors

$$X = \|x_1 \; x_2 \; \cdots \; x_n\| \quad \text{and} \quad Y = \|y_1 \; y_2 \; \cdots \; y_n\|$$

will be called perpendicular, or orthogonal, if and only if they
satisfy the condition

$$X \cdot Y = \sum_{i=1}^{n} x_i y_i = 0$$

This extended concept of orthogonality is of fundamental
importance in many applications of matrices.
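As a small illustration of the two definitions just given, the following Python sketch tests whether two vectors have proportional components and whether they are orthogonal. The helper names are hypothetical, and exact integer components are assumed so that equality tests are meaningful; with floating-point data a tolerance would be needed.

```python
def dot(x, y):
    """Scalar product of two vectors with the same number of components."""
    return sum(a * b for a, b in zip(x, y))

def are_orthogonal(x, y):
    """True when the scalar product of x and y is zero."""
    return dot(x, y) == 0

def are_proportional(x, y):
    """True when nonzero x and y have proportional components.

    Components are proportional exactly when every cross product
    x_i*y_j - x_j*y_i vanishes.
    """
    n = len(x)
    return all(x[i] * y[j] == x[j] * y[i]
               for i in range(n) for j in range(i + 1, n))

L = [1, 2, 2]
M = [2, 4, 4]     # M = 2L, so the components are proportional
N = [2, -1, 0]    # L . N = 2 - 2 + 0 = 0

print(are_proportional(L, M))   # True
print(are_orthogonal(L, N))     # True
print(are_orthogonal(L, M))     # False
```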

Two matrices A and B are said to be conformable in the
order AB if and only if the number of columns in A is equal to
the number of rows in B. In other words, if A is an (m,n) matrix
and if B is a (p,q) matrix, A and B are conformable in the order
AB if and only if $n = p$. With the idea of conformable matrices
introduced, we are now in a position to define the important
notion of the product of two matrices:

DEFINITION 3

If A is a (p,q) matrix and if B is a (q,r) matrix, so that A and B are conformable
in the order AB, the product $C = AB$ is the (p,r) matrix in which the element
$c_{ij}$ in the ith row and jth column is the scalar product of the ith row vector of A and
the jth column vector of B; i.e., $C = AB$ is the matrix for which $c_{ij} = \sum_{k=1}^{q} a_{ik} b_{kj}$.

Multiplication is not defined for matrices that are not conformable.


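EXAMPLE 1

$$\begin{Vmatrix} 2 & 3 \\ 1 & -1 \\ 0 & 4 \end{Vmatrix} \cdot \begin{Vmatrix} 5 & -2 & 4 & 7 \\ -6 & 1 & -3 & 0 \end{Vmatrix}$$

$$= \begin{Vmatrix} (2)(5) + (3)(-6) & (2)(-2) + (3)(1) & (2)(4) + (3)(-3) & (2)(7) + (3)(0) \\ (1)(5) + (-1)(-6) & (1)(-2) + (-1)(1) & (1)(4) + (-1)(-3) & (1)(7) + (-1)(0) \\ (0)(5) + (4)(-6) & (0)(-2) + (4)(1) & (0)(4) + (4)(-3) & (0)(7) + (4)(0) \end{Vmatrix}$$

$$= \begin{Vmatrix} -8 & -1 & -1 & 14 \\ 11 & -3 & 7 & 7 \\ -24 & 4 & -12 & 0 \end{Vmatrix}$$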
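Definition 3 translates directly into code. The sketch below is a minimal, unoptimized illustration in Python; the function name `matmul` is hypothetical, and the two factors are read off from the entry-by-entry products displayed in Example 1.

```python
def matmul(A, B):
    """Product C = AB of a (p,q) matrix and a (q,r) matrix.

    Each element c_ij is the scalar product of the ith row vector of A
    and the jth column vector of B.
    """
    q = len(B)
    if any(len(row) != q for row in A):
        raise ValueError("A and B are not conformable in the order AB")
    r = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(q)) for j in range(r)]
            for i in range(len(A))]

A = [[2, 3],
     [1, -1],
     [0, 4]]
B = [[5, -2, 4, 7],
     [-6, 1, -3, 0]]

for row in matmul(A, B):
    print(row)
# [-8, -1, -1, 14]
# [11, -3, 7, 7]
# [-24, 4, -12, 0]
```

Attempting `matmul(B, A)` raises an error, since a (2,4) matrix and a (3,2) matrix are not conformable in that order.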

From Theorem 12, Sec. 10.1, and Definition 3, it is clear that 
the way in which we have defined the product of two matrices 
is precisely the way in which we proved that the product of two 
determinants can be formed. Hence, we have the following 
important result: 

THEOREM 1 

If A and B are square matrices of the same order, the determinant of the product 
AB is equal to the product of the determinant of A and the determinant of B. 
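Theorem 1 can be checked numerically on a particular pair of matrices. The following Python sketch is one such check, not a proof; the helper names `det` and `matmul` and the two sample matrices are assumptions made for the illustration, and the determinant is evaluated by cofactor expansion along the first row, which is practical only for small orders.

```python
def matmul(A, B):
    """Row-by-column product of conformable matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor: delete the first row and the jth column.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

A = [[1, 2, 4], [-1, 0, 3], [3, 1, -2]]
B = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]

print(det(A), det(B))                        # 7 39
print(det(matmul(A, B)) == det(A) * det(B))  # True
```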

For the multiplication of matrices we have the following 
important theorems: 

THEOREM 2 

For suitably conformable matrices, multiplication is associative; i.e., 

A(BC) = (AB)C 

PROOF Let A be an (m,n) matrix, B an (n,p) matrix, and C a (p,q) matrix.
For convenience, let $BC = D$, $AB = E$, and let

$$A(BC) = AD = F \quad \text{and} \quad (AB)C = EC = G$$

Now the element in the ith row and jth column of the matrix $F = AD$ is, by
Definition 3,

$$f_{ij} = \sum_{k=1}^{n} a_{ik} d_{kj}$$

Moreover, since $D = BC$, we also have, by Definition 3,

$$d_{kj} = \sum_{l=1}^{p} b_{kl} c_{lj}$$

Hence, substituting,

$$(1) \qquad f_{ij} = \sum_{k=1}^{n} a_{ik} \Bigl( \sum_{l=1}^{p} b_{kl} c_{lj} \Bigr) = \sum_{k=1}^{n} \sum_{l=1}^{p} a_{ik} b_{kl} c_{lj}$$

where the last step follows from the fact that $a_{ik}$ is independent of the index l
of the inner summation and, hence, can be moved across the inner summation sign.

Similarly, the element $g_{ij}$ in the ith row and jth column of the matrix $G = EC$
is

$$g_{ij} = \sum_{l=1}^{p} e_{il} c_{lj} \quad \text{where} \quad e_{il} = \sum_{k=1}^{n} a_{ik} b_{kl}$$

Hence, substituting,

$$g_{ij} = \sum_{l=1}^{p} \Bigl( \sum_{k=1}^{n} a_{ik} b_{kl} \Bigr) c_{lj} = \sum_{l=1}^{p} \sum_{k=1}^{n} a_{ik} b_{kl} c_{lj}$$

Finally, interchanging the order of summation in the last double sum, we have

$$(2) \qquad g_{ij} = \sum_{k=1}^{n} \sum_{l=1}^{p} a_{ik} b_{kl} c_{lj}$$

From (1) and (2) it is clear that $f_{ij} = g_{ij}$ for all values of i and j. Hence,
$F = G$; i.e.,

$$A(BC) = (AB)C$$

as asserted.

It is interesting and helpful to note that the type symbol of
the product of a series of conformable matrices can be obtained
by “contracting” the type symbols of the factors by canceling
the common interior indices:

$$(m,n) \cdot (n,p) \cdot (p,q) \rightarrow (m,q)$$


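EXAMPLE 2

$$\begin{Vmatrix} 3 & 4 \\ 2 & 1 \end{Vmatrix} \left( \begin{Vmatrix} 1 & 2 \\ 2 & 5 \end{Vmatrix} \cdot \begin{Vmatrix} 2 & -1 \\ 0 & 3 \end{Vmatrix} \right) = \begin{Vmatrix} 3 & 4 \\ 2 & 1 \end{Vmatrix} \cdot \begin{Vmatrix} 2 & 5 \\ 4 & 13 \end{Vmatrix} = \begin{Vmatrix} 22 & 67 \\ 8 & 23 \end{Vmatrix}$$

$$\left( \begin{Vmatrix} 3 & 4 \\ 2 & 1 \end{Vmatrix} \cdot \begin{Vmatrix} 1 & 2 \\ 2 & 5 \end{Vmatrix} \right) \begin{Vmatrix} 2 & -1 \\ 0 & 3 \end{Vmatrix} = \begin{Vmatrix} 11 & 26 \\ 4 & 9 \end{Vmatrix} \cdot \begin{Vmatrix} 2 & -1 \\ 0 & 3 \end{Vmatrix} = \begin{Vmatrix} 22 & 67 \\ 8 & 23 \end{Vmatrix}$$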
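The associative law and the contraction of type symbols can both be checked on small cases. In the Python sketch below, the three (2,2) factors are an assumption: they are the matrices recovered from the intermediate products shown in Example 2. The matrices P, Q, R and the helper names are hypothetical additions for the illustration.

```python
def matmul(A, B):
    """Row-by-column product of conformable matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def shape(M):
    """Type symbol (rows, columns) of a matrix stored as a list of rows."""
    return (len(M), len(M[0]))

A = [[3, 4], [2, 1]]
B = [[1, 2], [2, 5]]
C = [[2, -1], [0, 3]]

print(matmul(A, matmul(B, C)))   # [[22, 67], [8, 23]]
print(matmul(matmul(A, B), C))   # [[22, 67], [8, 23]]

# Contraction of type symbols: (3,2)(2,4)(4,1) gives (3,1).
P = [[1, 0], [0, 1], [1, 1]]
Q = [[1, 2, 3, 4], [0, 1, 0, 1]]
R = [[1], [1], [1], [1]]
print(shape(matmul(matmul(P, Q), R)))   # (3, 1)
```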


From the definition of conformable matrices, it is evident
that a matrix is conformable to itself if and only if it is square.
Hence, a matrix A can be multiplied by itself if and only if it is
square. In such a case the product AA is referred to as the square
of A and is denoted by the symbol $A^2$. Higher powers of A are
defined in similar fashion, Theorem 2 guaranteeing that the
definition

$$A^r = \underbrace{A A \cdots A}_{r \text{ factors}}$$

is unambiguous. With $A^0$ defined as the identity matrix I, it is
obvious that, for any nonnegative integers r and s, the familiar
laws of exponents

$$A^r A^s = A^{r+s} \quad \text{and} \quad (A^r)^s = A^{rs}$$

hold for matric multiplication.

THEOREM 3 

For suitably conformable matrices, multiplication is distributive over addition; 
i.e., A(B + C) = AB + AC. 

Theorems 2 and 3 are “obvious”; that is, they assert prop- 
erties which we know to be true for products in elementary alge- 
bra and which, by analogy, we would expect to be true in matric 
algebra. That these results must be proved and cannot be taken 
for granted is clear, however, from the next two theorems, which 
tell us that two other equally simple properties of ordinary 
algebraic multiplication do not hold for matric multiplication : 

THEOREM 4

The product of two nonzero matrices may be a zero matrix; i.e., the fact that
$AB = 0$ does not imply that either $A = 0$ or $B = 0$.

PROOF Clearly, to prove this theorem it is sufficient to exhibit two nonzero
matrices whose product is a zero matrix, and we have, among infinitely many
possibilities,

$$\begin{Vmatrix} 6 & 4 & 2 \\ 9 & 6 & 3 \\ -3 & -2 & -1 \end{Vmatrix} \cdot \begin{Vmatrix} 0 & 1 & -2 \\ -1 & 0 & 3 \\ 2 & -3 & 0 \end{Vmatrix} = \begin{Vmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{Vmatrix}$$


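THEOREM 5

Even for matrices which are conformable in either order, multiplication is not
commutative; i.e., in general, $AB \neq BA$.

PROOF To prove this theorem it is sufficient to exhibit two matrices A and B
such that $AB \neq BA$, and we have, specifically,

$$\begin{Vmatrix} 1 & 2 \\ 3 & 4 \end{Vmatrix} \cdot \begin{Vmatrix} 1 & 1 \\ 4 & 1 \end{Vmatrix} = \begin{Vmatrix} 9 & 3 \\ 19 & 7 \end{Vmatrix} \neq \begin{Vmatrix} 1 & 1 \\ 4 & 1 \end{Vmatrix} \cdot \begin{Vmatrix} 1 & 2 \\ 3 & 4 \end{Vmatrix} = \begin{Vmatrix} 4 & 6 \\ 7 & 12 \end{Vmatrix}$$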
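Both counterexamples are easy to confirm by direct computation. The Python sketch below does so with a hypothetical helper `matmul`; the matrices are those read from the proofs of Theorems 4 and 5, with the entries of the Theorem 4 factors taken as an assumption recovered from the printed array.

```python
def matmul(A, B):
    """Row-by-column product of conformable matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Theorem 4: two nonzero matrices whose product is a zero matrix.
U = [[6, 4, 2], [9, 6, 3], [-3, -2, -1]]
V = [[0, 1, -2], [-1, 0, 3], [2, -3, 0]]
print(matmul(U, V))   # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

# Theorem 5: AB and BA differ even though both products exist.
A = [[1, 2], [3, 4]]
B = [[1, 1], [4, 1]]
print(matmul(A, B))   # [[9, 3], [19, 7]]
print(matmul(B, A))   # [[4, 6], [7, 12]]
```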


In two special cases, however, the multiplication of matrices 
is commutative. Though these are simple and obvious, they are 
of sufficient importance to be stated explicitly: 

THEOREM 6

Both unit matrices and zero matrices commute with all suitably conformable
matrices; more specifically, $AI = IA = A$ and $AO = OA = O$.

Since matric multiplication is not, in general, commutative,
it is desirable to be able to describe concisely the order in which
two conformable matrices are to be multiplied. This we shall do
by adopting the following terminology: In the product AB we
shall say that A premultiplies B, or B is premultiplied by A, and
B postmultiplies A, or A is postmultiplied by B.


THEOREM 7

The transpose of the product of two conformable matrices is equal to the product
of the transposed matrices taken in the other order; i.e., $(AB)^T = B^T A^T$.

PROOF Let A be a (p,q) matrix and let B be a (q,r) matrix. Then from the
definition of the transpose of a matrix it follows that the element in the ith row
and jth column of $(AB)^T$ is the element in the jth row and ith column of AB,
namely,

$$(3) \qquad \sum_{k=1}^{q} a_{jk} b_{ki}$$

On the other hand, the ith row of $B^T$ is, by definition, the ith column of B; i.e.,
the ith row of $B^T$ consists of the elements

$$b_{1i}, b_{2i}, \ldots, b_{qi}$$

Similarly, the jth column of $A^T$ is, by definition, the jth row of A; i.e., the jth
column of $A^T$ consists of the elements

$$a_{j1}, a_{j2}, \ldots, a_{jq}$$

Hence, the element in the ith row and jth column of $B^T A^T$ is

$$\sum_{k=1}^{q} b_{ki} a_{jk} = \sum_{k=1}^{q} a_{jk} b_{ki}$$

Since this is the same as the expression (3) for the corresponding element in the
matrix $(AB)^T$, the theorem is established.

COROLLARY 1

The transpose of the product of any number of conformable matrices is the
product of the transposed matrices taken in the other order; i.e.,

$$(A_1 A_2 \cdots A_n)^T = A_n^T \cdots A_2^T A_1^T$$
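The reversal of order under transposition can be verified on a nonsquare case, where the wrong order would not even be conformable. The Python sketch below is only a check on one example; `transpose` and `matmul` are hypothetical helper names and the sample matrices are chosen for the illustration.

```python
def matmul(A, B):
    """Row-by-column product of conformable matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    """Matrix whose rows are the columns of A."""
    return [list(col) for col in zip(*A)]

A = [[2, 3], [1, -1], [0, 4]]             # a (3,2) matrix
B = [[5, -2, 4, 7], [-6, 1, -3, 0]]       # a (2,4) matrix

lhs = transpose(matmul(A, B))             # (AB)^T, a (4,3) matrix
rhs = matmul(transpose(B), transpose(A))  # B^T A^T, also (4,3)
print(lhs == rhs)                         # True
```

Here $A^T$ is a (2,3) matrix and $B^T$ is a (4,2) matrix, so the product $A^T B^T$ is not defined at all, while $B^T A^T$ is.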

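The definition of a matrix in no way rules out the possibility
that the elements of a matrix are themselves matrices. In fact,
it is often convenient to subdivide, or partition, a matrix into
submatrices and then regard the original matrix as a new matrix
having these submatrices as elements. In particular, it is
frequently helpful to regard an (m,n) matrix $A = \|a_{ij}\|$ as a row
matrix $\|C_1 \; C_2 \; \cdots \; C_n\|$, whose elements are the respective
column vectors of A, or as a column matrix

$$\begin{Vmatrix} R_1 \\ \vdots \\ R_m \end{Vmatrix}$$

whose elements are the respective row vectors of A.

EXAMPLE 3

For instance, among numerous other possibilities, we can write

$$A = \left\| \begin{array}{ccc|c} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ \hline a_{31} & a_{32} & a_{33} & a_{34} \end{array} \right\| = \begin{Vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{Vmatrix}$$

where

$$A_{11} = \begin{Vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{Vmatrix} \qquad A_{12} = \begin{Vmatrix} a_{14} \\ a_{24} \end{Vmatrix} \qquad A_{21} = \|a_{31} \; a_{32} \; a_{33}\| \qquad A_{22} = \|a_{34}\|$$

or equally well

$$A = \|A_{11} \; A_{12}\|$$

where $A_{11}$ consists of the first three columns of A and

$$A_{12} = \begin{Vmatrix} a_{14} \\ a_{24} \\ a_{34} \end{Vmatrix}$$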

In constructing the product of two matrices it is sometimes 
convenient to partition them before performing the multiplica- 
tion. This can be done in many ways, but it is of course necessary 
that the given matrices be conformable and that the various 
submatrices which must be multiplied together also be con- 
formable. This requirement imposes no restriction on the hori- 
zontal partitioning of the first matrix or on the vertical parti- 
tioning of the second matrix. It does require, however, that the 
columns of the first matrix be partitioned into groups such that 
the number of columns in each group is equal to the number of 
rows in the corresponding partition of the rows of the second 
matrix. Matrices for which this is the case are said to be con- 
formably partitioned. 


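EXAMPLE 4

By direct multiplication we have

$$\begin{Vmatrix} 1 & 1 & 1 \\ 2 & -1 & 0 \\ -1 & 0 & 2 \end{Vmatrix} \cdot \begin{Vmatrix} 1 & 2 & 3 & -1 \\ 3 & -1 & 1 & 0 \\ 0 & 0 & -2 & 1 \end{Vmatrix} = \begin{Vmatrix} 4 & 1 & 2 & 0 \\ -1 & 5 & 5 & -2 \\ -1 & -2 & -7 & 3 \end{Vmatrix}$$

On the other hand we can write, among various other possibilities,

$$\left\| \begin{array}{cc|c} 1 & 1 & 1 \\ 2 & -1 & 0 \\ \hline -1 & 0 & 2 \end{array} \right\| = \begin{Vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{Vmatrix} \qquad \left\| \begin{array}{cccc} 1 & 2 & 3 & -1 \\ 3 & -1 & 1 & 0 \\ \hline 0 & 0 & -2 & 1 \end{array} \right\| = \begin{Vmatrix} B_{11} \\ B_{21} \end{Vmatrix}$$

and, from this point of view, the product of the two matrices is

$$\begin{Vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{Vmatrix} \cdot \begin{Vmatrix} B_{11} \\ B_{21} \end{Vmatrix} = \begin{Vmatrix} A_{11}B_{11} + A_{12}B_{21} \\ A_{21}B_{11} + A_{22}B_{21} \end{Vmatrix}$$

or, performing the indicated multiplications and additions of the submatrices,

$$\begin{Vmatrix} \begin{Vmatrix} 4 & 1 & 4 & -1 \\ -1 & 5 & 5 & -2 \end{Vmatrix} + \begin{Vmatrix} 0 & 0 & -2 & 1 \\ 0 & 0 & 0 & 0 \end{Vmatrix} \\[2ex] \|-1 \; -2 \; -3 \; 1\| + \|0 \; 0 \; -4 \; 2\| \end{Vmatrix} = \begin{Vmatrix} 4 & 1 & 2 & 0 \\ -1 & 5 & 5 & -2 \\ -1 & -2 & -7 & 3 \end{Vmatrix}$$

as before.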
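Multiplication of conformably partitioned matrices can be sketched in Python as follows. The two matrices are those of Example 4; the split after the second column of the first factor and after the second row of the second factor is an assumption consistent with the partial products shown there, and the helper names are hypothetical.

```python
def matmul(A, B):
    """Row-by-column product of conformable matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B):
    """Elementwise sum of two matrices of the same type."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[1, 1, 1], [2, -1, 0], [-1, 0, 2]]
B = [[1, 2, 3, -1], [3, -1, 1, 0], [0, 0, -2, 1]]

# Partition the columns of A as (2 | 1) and the rows of B as (2 | 1).
A_left = [row[:2] for row in A]
A_right = [row[2:] for row in A]
B_top, B_bottom = B[:2], B[2:]

blockwise = matadd(matmul(A_left, B_top), matmul(A_right, B_bottom))
print(blockwise == matmul(A, B))   # True
print(blockwise)   # [[4, 1, 2, 0], [-1, 5, 5, -2], [-1, -2, -7, 3]]
```

Only the column partition of the first factor and the row partition of the second must agree; a further horizontal split of A would merely divide the rows of the result into groups.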




Historically, the definition of the product of two matrices
was introduced by the English mathematician Arthur Cayley
(1821-1895) as a result of his investigations on linear
transformations. By a linear transformation we mean a relation of the form

$$T_a\colon \quad \begin{aligned} y_1 &= a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ y_2 &= a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ &\;\;\vdots \\ y_n &= a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n \end{aligned}$$

connecting the variables $(x_1, x_2, \ldots, x_n)$ and the variables
$(y_1, y_2, \ldots, y_n)$ in which the $a_{ij}$ are constants independent of
$(x_1, \ldots, x_n)$ and $(y_1, \ldots, y_n)$. If $n = 2$, we can think of $T_a$
as a transformation of the cartesian plane which sends a point
with coordinates $(x_1, x_2)$ into a point with coordinates $(y_1, y_2)$.
Similarly, if $n = 3$, we can think of $T_a$ as a transformation which
sends a point with coordinates $(x_1, x_2, x_3)$ into a point with
coordinates $(y_1, y_2, y_3)$. If $n > 3$, we can regard $T_a$ as a transformation
in a hyperspace of the appropriate number of dimensions, or we
may think of it simply as a transformation of an n-component
vector X into an n-component vector Y. From the definition
of matric multiplication and the equality of two matrices, it is
clear that if we introduce the matrices

$$Y = \begin{Vmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{Vmatrix} \qquad A = \begin{Vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{Vmatrix} \qquad X = \begin{Vmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{Vmatrix}$$

then the transformation $T_a$ can be written in the form

$$(4) \qquad T_a\colon \; Y = AX$$

The matrix A in Eq. (4) is usually referred to as the matrix
of the transformation $T_a$. It is thus apparent that matrices are
intimately related to linear transformations and systems of
linear equations.

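Suppose, now, that, in addition to the transformation $T_a$
which transforms a vector X into a vector Y, we have a second
transformation

$$(5) \qquad T_b\colon \; Z = BY \qquad Z = \begin{Vmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{Vmatrix} \qquad B = \begin{Vmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nn} \end{Vmatrix}$$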

which transforms a vector Y into a vector Z. If $T_a$ is applied to a
vector X and then $T_b$ is applied to the resulting vector Y, the
net result is to transform the vector X into the vector Z, and it
is a matter of some interest to find the equations of the
equivalent transformation connecting X directly with Z. This can easily
be done, of course, simply by eliminating the variables
$(y_1, y_2, \ldots, y_n)$ between the equations of $T_a$ and $T_b$. To do this we
observe that the equations of $T_a$ and $T_b$ can be written respectively

$$T_a\colon \quad y_k = \sum_{j=1}^{n} a_{kj} x_j \qquad k = 1, 2, \ldots, n$$

$$T_b\colon \quad z_i = \sum_{k=1}^{n} b_{ik} y_k \qquad i = 1, 2, \ldots, n$$

Hence, eliminating the y's by substituting for $y_k$ in the equations
of $T_b$, we have

$$z_i = \sum_{k=1}^{n} b_{ik} \Bigl( \sum_{j=1}^{n} a_{kj} x_j \Bigr) = \sum_{j=1}^{n} \Bigl( \sum_{k=1}^{n} b_{ik} a_{kj} \Bigr) x_j \qquad i = 1, 2, \ldots, n$$

Thus the coefficient of $x_j$ in the equation for $z_i$ is

$$\sum_{k=1}^{n} b_{ik} a_{kj}$$

which is precisely the element $c_{ij}$ in the product BA. In other
words, the matric form of the single transformation $T_{ba}$,
equivalent to following $T_a$ by $T_b$, may be found simply by eliminating
Y between Eq. (4) and Eq. (5):

$$Z = BY = B(AX) = (BA)X$$

Thus we have established the following important result:


THEOREM 8

The result of following a linear transformation $T_a\colon Y = AX$ with the transformation
$T_b\colon Z = BY$ is the single transformation $T_{ba}\colon Z = BAX$, whose matrix is
the product BA of the matrices of $T_b$ and $T_a$.
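Theorem 8 can be illustrated by applying two transformations in succession and comparing with the single transformation whose matrix is BA. In the Python sketch below the particular matrices (a quarter-turn rotation and a scaling of the plane) and the helper names `apply` and `matmul` are assumptions chosen for the illustration.

```python
def matmul(A, B):
    """Row-by-column product of conformable matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def apply(M, x):
    """Image of the vector x under the transformation with matrix M."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

A = [[0, -1], [1, 0]]   # T_a: rotation of the plane through 90 degrees
B = [[2, 0], [0, 3]]    # T_b: stretch by 2 along x and by 3 along y
x = [1, 2]

step_by_step = apply(B, apply(A, x))   # T_a first, then T_b
single = apply(matmul(B, A), x)        # the one transformation with matrix BA
print(step_by_step, single)            # [-4, 3] [-4, 3]
print(apply(matmul(A, B), x))          # [-6, 2], the other order gives a different result
```

The last line shows why the order matters: the matrix of "first $T_a$, then $T_b$" is BA, not AB.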

As a further illustration of the importance of matric
multiplication, we return to the idea of transition probabilities which
we mentioned earlier in this section. Suppose that a system S
can exist in any of n states $S_1, S_2, \ldots, S_n$, and that by some
random process the system may pass directly from the ith state
to the jth state with probability $p_{ij}^{(1)}$ $(i, j = 1, 2, \ldots, n)$.
Naturally, the system may also pass from the ith state to the jth
state by first passing to some intermediate state, say the kth,
and then passing from the kth state to the jth; and the
calculation of these two-step transition probabilities is a matter of some
importance. Now the probability that the system passes from
the ith state to the jth state via the kth state is the product of
the probability $p_{ik}^{(1)}$ that it passes in one step from $S_i$ to $S_k$ and
the probability $p_{kj}^{(1)}$ that it then passes in one step from $S_k$ to
$S_j$. Furthermore, since in any two-step transition from $S_i$ to $S_j$
the system must pass through some intermediate state (including,
of course, $S_i$ and $S_j$ themselves) the probability $p_{ij}^{(2)}$ that the
system passes in exactly two steps from $S_i$ to $S_j$ is

$$p_{ij}^{(2)} = \sum_{k=1}^{n} p_{ik}^{(1)} p_{kj}^{(1)}$$

But the sum on the right in the last formula is precisely the
element in the ith row and jth column of the square of the matrix
of one-step transition probabilities, $P = \|p_{ij}^{(1)}\|$. In other words,
the matrix of two-step transition probabilities for any system S is
the square of the matrix of one-step transition probabilities for S.
A similar argument shows that the matrix of three-step
transition probabilities for S is the cube of the matrix of one-step
transition probabilities, and so on.


EXAMPLE 5

Let S be the system consisting of two players A and B who begin with two dollars apiece and
match coins until one or the other has no more money. If the states of the system are defined
by the number of dollars in A's possession; specifically, if the system is in the state $S_{i+1}$ whenever
A has i dollars $(i = 0, 1, 2, 3, 4)$; find the matrix of one-step transition probabilities. Then, by
raising this matrix to the second, third, and fourth powers, find the matrices containing the two-,
three-, and four-step transition probabilities for S. What is the probability that A will be ruined
in at most four turns? What is the probability that A will be ruined in exactly four turns?

Clearly, unless A or B is bankrupt, A must either win a dollar or lose a dollar on each turn,
and the probability of each of these events is $\tfrac12$. Hence, if A has i dollars $(i = 1, 2, 3)$, that is,
if the system is in the state $S_{i+1}$ $(i = 1, 2, 3)$, the probability of a one-step transition to $S_i$ is $\tfrac12$,
the probability of a one-step transition to $S_{i+2}$ is $\tfrac12$, and the probability of a one-step
transition to any other state is zero. On the other hand, if the system is in the state $S_1$, that is, if A is
bankrupt, the system remains in that state; so the probability of a one-step transition from $S_1$ to
$S_1$ is 1, and the probability of any other transition from $S_1$ is 0. Similarly, if the system is in
the state $S_5$, that is, if B is bankrupt, the system remains in that state; hence the probability
of a one-step transition from $S_5$ to $S_5$ is 1, and the probability of any other transition is 0. Thus
the matrix of one-step transition probabilities is

$$P = \begin{Vmatrix} 1 & 0 & 0 & 0 & 0 \\ \tfrac12 & 0 & \tfrac12 & 0 & 0 \\ 0 & \tfrac12 & 0 & \tfrac12 & 0 \\ 0 & 0 & \tfrac12 & 0 & \tfrac12 \\ 0 & 0 & 0 & 0 & 1 \end{Vmatrix}$$

By multiplying this matrix by itself we find at once that the matrix of two-step transition
probabilities is

$$P^2 = \begin{Vmatrix} 1 & 0 & 0 & 0 & 0 \\ \tfrac12 & \tfrac14 & 0 & \tfrac14 & 0 \\ \tfrac14 & 0 & \tfrac12 & 0 & \tfrac14 \\ 0 & \tfrac14 & 0 & \tfrac14 & \tfrac12 \\ 0 & 0 & 0 & 0 & 1 \end{Vmatrix}$$

Similarly, by computing $P^3$ and $P^4$, we find the matrices of three-step and four-step transition
probabilities, respectively, to be

$$P^3 = \begin{Vmatrix} 1 & 0 & 0 & 0 & 0 \\ \tfrac58 & 0 & \tfrac14 & 0 & \tfrac18 \\ \tfrac14 & \tfrac14 & 0 & \tfrac14 & \tfrac14 \\ \tfrac18 & 0 & \tfrac14 & 0 & \tfrac58 \\ 0 & 0 & 0 & 0 & 1 \end{Vmatrix} \quad \text{and} \quad P^4 = \begin{Vmatrix} 1 & 0 & 0 & 0 & 0 \\ \tfrac58 & \tfrac18 & 0 & \tfrac18 & \tfrac18 \\ \tfrac38 & 0 & \tfrac14 & 0 & \tfrac38 \\ \tfrac18 & \tfrac18 & 0 & \tfrac18 & \tfrac58 \\ 0 & 0 & 0 & 0 & 1 \end{Vmatrix}$$

The probability that A is ruined in at most four turns is simply the probability of a four-step
transition from $S_3$ to $S_1$, namely, $p_{31}^{(4)} = \tfrac38$, since among such transitions are included those in
which the system reaches $S_1$ in less than four steps and then remains there. The probability
that A is ruined in four turns and not before is the probability that S reaches $S_1$ in four steps but
does not reach it in three steps or less, namely, $p_{31}^{(4)} - p_{31}^{(3)} = \tfrac38 - \tfrac14 = \tfrac18$.
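The powers of the transition matrix in Example 5 can be reproduced with exact rational arithmetic. The Python sketch below uses the standard `fractions` module; the helper name `matmul` and the variable names are hypothetical, and indices are zero-based, so the state $S_3$ corresponds to row 2 and $S_1$ to column 0.

```python
from fractions import Fraction

def matmul(A, B):
    """Row-by-column product of conformable matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

h = Fraction(1, 2)
# One-step transition probabilities; row i+1 is the state "A has i dollars".
P = [[1, 0, 0, 0, 0],
     [h, 0, h, 0, 0],
     [0, h, 0, h, 0],
     [0, 0, h, 0, h],
     [0, 0, 0, 0, 1]]

P2 = matmul(P, P)
P3 = matmul(P2, P)
P4 = matmul(P3, P)

print(P4[2][0])               # 3/8, ruin of A in at most four turns
print(P4[2][0] - P3[2][0])    # 1/8, ruin of A in exactly four turns
print(all(sum(row) == 1 for row in P4))   # True, each row still sums to 1
```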



EXERCISES 

1 Multiply the matrices

!> 

,1 3 0 

using the indicated partitioning. Check by multiplying without regard to the partitioning. 
2 Evaluate the matric polynomial $X^3 - 4X^2 - X + 4I$ for each of the following matrices:

$$\text{a} \;\; \begin{Vmatrix} 1 & -1 \\ 2 & 0 \end{Vmatrix} \qquad \text{b} \;\; \begin{Vmatrix} 1 & 1 & 2 \\ 1 & 2 & 1 \\ 2 & 1 & 1 \end{Vmatrix} \qquad \text{c} \;\; \begin{Vmatrix} 0 & 1 & 1 \\ -1 & 0 & 1 \\ -1 & -1 & 0 \end{Vmatrix}$$

3 Verify that $(X - 3I)(X - 2I) = (X - 2I)(X - 3I) = X^2 - 5X + 6I$ for

$$X = \begin{Vmatrix} 1 & 2 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{Vmatrix}$$

Do you think that this relation is an identity for all square matrices X?

4 Show that

$$\begin{Vmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{Vmatrix}^n = \begin{Vmatrix} \cos n\theta & \sin n\theta \\ -\sin n\theta & \cos n\theta \end{Vmatrix}$$

5 Show that

$$\begin{Vmatrix} \cosh\theta & \sinh\theta \\ \sinh\theta & \cosh\theta \end{Vmatrix}^n = \begin{Vmatrix} \cosh n\theta & \sinh n\theta \\ \sinh n\theta & \cosh n\theta \end{Vmatrix}$$

6 Show that, if a matrix A commutes with a diagonal matrix D whose diagonal elements are
all distinct, then A is a diagonal matrix. Is A necessarily diagonal if the diagonal elements of
D are not all distinct?

7 If A and B are conformable matrices, show that the ith row vector in the product AB is
$A_iB$ where $A_i$ is the ith row vector of A. What is the jth column vector in the product AB?

8 If

$$A = \begin{Vmatrix} 1 & 2 & -1 \\ 2 & 1 & 3 \\ 4 & 5 & 1 \end{Vmatrix}$$

find a nonzero $3 \times 3$ matrix X such that AX is a zero matrix.
Is $XA = 0$? Is X unique?

9 If A and B are symmetric matrices of the same order, prove that the product AB is
symmetric if and only if $AB = BA$.

10 If A and B are two square matrices which commute and if r and s are positive integers, prove
that $A^r$ and $B^s$ also commute.

11 Prove Theorem 3.

12 Prove Corollary 1, Theorem 7.

13 Prove that $(A + B)^T = A^T + B^T$.

14 If K is a diagonal matrix whose diagonal elements are all equal to k, prove that the product
of K and any conformable matrix A is equal to the product of A and the scalar k. Because of
this property the matrix K is often referred to as a scalar matrix.

15 By definition, the transpose of the matrix $A = \|a_{ij}\|$ is the matrix $A^T = \|a_{ji}\|$. Is this
formula correct if the $a_{ij}$'s are submatrices of A?

16 By the derivative of a matrix A we mean the matrix whose elements are the derivatives
of the elements of A. Assuming that the elements of the matrices A and B are differentiable
functions of x, use this definition to show that $d(AB)/dx = (dA/dx)B + A(dB/dx)$. Is
$d(A^2)/dx = 2A(dA/dx)$?

17 Show that in any matrix of transition probabilities, the sum of the elements in any row is 1.

j i n 

3 1 

-7 "of an{l 

1 3 

i 2 1| 

2 0 




18 Consider the system S consisting of four boxes $B_1$, $B_2$, $B_3$, and $B_4$ and a single ball, and let
the system be in the state $S_i$ $(i = 1, 2, 3, 4)$ if the ball is in the box $B_i$. Transitions from one
state to another take place in the following manner: A die is thrown and, if a 1, 2, or 3 turns
up, the ball is taken from whichever box it may have been in and placed in the box bearing
the number showing on the die. If 4, 5, or 6 turns up, the ball is taken from wherever it may
have been and placed in the box $B_4$. Find the matrix of one-, two-, and three-step transition
probabilities for the system.

19 Consider the system S consisting of three boxes $B_1$, $B_2$, and $B_3$ and a single ball, and let the
system be in the state $S_i$ $(i = 1, 2, 3)$ if the ball is in the box $B_i$. Transitions from one state
to another take place in the following manner: Three coins are tossed. If no heads turn up,
the ball is not moved, but, if one or more heads turn up, the ball is taken from whichever
box it may have been in and placed in the box corresponding to the number of heads
showing. Find the matrix of one-, two-, three-, and four-step transition probabilities for the
system.

20 Show that, in computing the product of a (p,q) matrix and a (q,r) matrix, $pqr$ multiplications
and $pr(q - 1)$ additions must be performed. If a (p,q) matrix and a (q,r) matrix are
conformably partitioned into four submatrices by one partition of their rows and one partition of
their columns, prove that the same number of multiplications and the same number of
additions are required in multiplying the two matrices whether this is done in the original
or in the partitioned form.


10.3 

Adjoints and inverses

It is a familiar fact of elementary algebra that any quantity Q
not equal to zero has a reciprocal

$$Q^{-1} = \frac{1}{Q}$$

with the property that

$$QQ^{-1} = Q^{-1}Q = 1$$

The familiar process of division, which we sometimes inaccurately 
regard as being essentially independent of multiplication, is 
nothing but multiplication involving the reciprocal, or
multiplicative inverse, of the divisor as one factor. In matric algebra,
although we do not define division as such, we can in an important 
class of cases define the reciprocal, or inverse, of a matrix. With 
reciprocals defined, multiplication then serves to accomplish all 
we might properly expect to do by division. As usual our develop- 
ment begins with a number of definitions: 

We have already defined the determinant of a square matrix 
as the determinant whose array of elements is identical with the 
array of the matrix itself. Clearly, only square matrices have 
determinants. A square matrix whose determinant is different 
from zero is said to be nonsingular. A square matrix whose 
determinant is equal to zero is said to be singular. Using these 
notions, we can now give formal definitions of the important con- 
cepts of the adjoint and the inverse of a matrix. 




DEFINITION 1

If $A = \|a_{ij}\|$ is a square matrix and if $A_{ij}$ is the cofactor of $a_{ij}$ in the determinant
of A, then the matrix

$$\|A_{ji}\| = \|A_{ij}\|^T = \text{transpose of } \|A_{ij}\|$$

is called the adjoint of the matrix A.

The adjoint of a square matrix A is sometimes indicated by the
notation adj A.


DEFINITION 2

The reciprocal, or inverse, $A^{-1}$ of a nonsingular matrix $A = \|a_{ij}\|$ is the adjoint
of A divided by the determinant of A; i.e.,

$$A^{-1} = \frac{\|A_{ji}\|}{|A|} = \frac{\|A_{ij}\|^T}{|A|}$$

Clearly, although every square matrix has an adjoint, only 
nonsingular matrices have inverses. 

The fundamental importance of the reciprocal, or inverse, 
of a matrix is apparent from the following theorem : 


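THEOREM 1

The product of a nonsingular matrix A and its reciprocal in either order is a unit
matrix; i.e., $A^{-1}A = AA^{-1} = I$.


PROOF Let $A = \|a_{ij}\|$ be a nonsingular matrix and consider first the product

$$A^{-1}A = \frac{1}{|A|} \begin{Vmatrix} A_{11} & A_{21} & \cdots & A_{n1} \\ A_{12} & A_{22} & \cdots & A_{n2} \\ \vdots & \vdots & & \vdots \\ A_{1n} & A_{2n} & \cdots & A_{nn} \end{Vmatrix} \cdot \begin{Vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{Vmatrix}$$

Clearly, from the definition of matric multiplication, the element in the ith row
and jth column of the product of the two matrices on the right is the scalar
product

$$\sum_{k=1}^{n} A_{ki} a_{kj}$$

Moreover, from Corollary 1, Theorem 11, Sec. 10.1, this sum is equal to $|A|$ if
$i = j$ and is equal to zero if $i \neq j$. Hence,

$$A^{-1}A = \frac{1}{|A|} \begin{Vmatrix} |A| & 0 & \cdots & 0 \\ 0 & |A| & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & |A| \end{Vmatrix} = \begin{Vmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{Vmatrix} = I$$

as asserted.

The proof that $AA^{-1} = I$ follows in exactly the same fashion.

COROLLARY 1

For any square matrix, $(\text{adj } A)A = A(\text{adj } A) = |A|I$.

COROLLARY 2

If A is a nonsingular matrix, then $A^{-1}$ is also nonsingular and $|A^{-1}| = 1/|A|$.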


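EXAMPLE 1

If

$$A = \begin{Vmatrix} 1 & 2 & 4 \\ -1 & 0 & 3 \\ 3 & 1 & -2 \end{Vmatrix}$$

then the determinant of A is

$$\begin{vmatrix} 1 & 2 & 4 \\ -1 & 0 & 3 \\ 3 & 1 & -2 \end{vmatrix} = 7$$

The adjoint of A is the transpose of

$$\begin{Vmatrix} \begin{vmatrix} 0 & 3 \\ 1 & -2 \end{vmatrix} & -\begin{vmatrix} -1 & 3 \\ 3 & -2 \end{vmatrix} & \begin{vmatrix} -1 & 0 \\ 3 & 1 \end{vmatrix} \\[2ex] -\begin{vmatrix} 2 & 4 \\ 1 & -2 \end{vmatrix} & \begin{vmatrix} 1 & 4 \\ 3 & -2 \end{vmatrix} & -\begin{vmatrix} 1 & 2 \\ 3 & 1 \end{vmatrix} \\[2ex] \begin{vmatrix} 2 & 4 \\ 0 & 3 \end{vmatrix} & -\begin{vmatrix} 1 & 4 \\ -1 & 3 \end{vmatrix} & \begin{vmatrix} 1 & 2 \\ -1 & 0 \end{vmatrix} \end{Vmatrix} = \begin{Vmatrix} -3 & 7 & -1 \\ 8 & -14 & 5 \\ 6 & -7 & 2 \end{Vmatrix}$$

The inverse of A is, therefore,

$$A^{-1} = \frac{1}{7} \begin{Vmatrix} -3 & 8 & 6 \\ 7 & -14 & -7 \\ -1 & 5 & 2 \end{Vmatrix}$$

As a check, we have

$$\frac{1}{7} \begin{Vmatrix} -3 & 8 & 6 \\ 7 & -14 & -7 \\ -1 & 5 & 2 \end{Vmatrix} \cdot \begin{Vmatrix} 1 & 2 & 4 \\ -1 & 0 & 3 \\ 3 & 1 & -2 \end{Vmatrix} = \frac{1}{7} \begin{Vmatrix} 7 & 0 & 0 \\ 0 & 7 & 0 \\ 0 & 0 & 7 \end{Vmatrix} = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{Vmatrix}$$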
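Definitions 1 and 2 can be followed literally in code. The Python sketch below computes the adjoint as the transposed matrix of cofactors and divides by the determinant; it assumes integer entries so that `fractions.Fraction` gives exact results, the function names are hypothetical, and cofactor expansion is used for clarity rather than efficiency.

```python
from fractions import Fraction

def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 0:
        return 1   # convention for the empty minor of a (1,1) matrix
    if n == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(n))

def cofactor(M, i, j):
    """Signed minor obtained by deleting row i and column j of M."""
    minor = [r[:j] + r[j + 1:] for k, r in enumerate(M) if k != i]
    return (-1) ** (i + j) * det(minor)

def adjoint(M):
    """Transpose of the matrix of cofactors: entry (i,j) is the cofactor of a_ji."""
    n = len(M)
    return [[cofactor(M, j, i) for j in range(n)] for i in range(n)]

def inverse(M):
    """Adjoint divided by the determinant; defined only for nonsingular M."""
    d = det(M)
    if d == 0:
        raise ValueError("a singular matrix has no inverse")
    return [[Fraction(c, d) for c in row] for row in adjoint(M)]

def matmul(A, B):
    """Row-by-column product of conformable matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2, 4], [-1, 0, 3], [3, 1, -2]]
print(det(A))        # 7
print(adjoint(A))    # [[-3, 8, 6], [7, -14, -7], [-1, 5, 2]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(matmul(inverse(A), A) == I3 and matmul(A, inverse(A)) == I3)   # True
```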


From Theorem 1 it is clear that, if A is a nonsingular matrix,
then each of the equations $AX = I$ and $XA = I$ has $X = A^{-1}$ as
one solution. Actually, we have the following stronger result:


THEOREM 2

If A is a nonsingular matrix, then $X = A^{-1}$ is the unique solution of each of the
equations $AX = I$ and $XA = I$.

PROOF Consider first the equation $AX = I$. If both $X_1$ and $X_2$ satisfy this
equation, then $AX_1 = AX_2$, since each is equal to I. Moreover, since A is
nonsingular, it follows that $A^{-1}$ exists. Hence, premultiplying by $A^{-1}$, we have

$$A^{-1}AX_1 = A^{-1}AX_2$$
$$IX_1 = IX_2$$
$$X_1 = X_2$$

Thus the equation $AX = I$ has in fact just one solution, and from Theorem 1
it follows that this solution is $X = A^{-1}$. A similar argument shows that $X = A^{-1}$
is also the unique solution of the equation $XA = I$, as asserted.

COROLLARY 1

If A is a nonsingular (n,n) matrix, if B is an (n,m) matrix, and if C is an (m,n)
matrix, the equation $AX = B$ has the unique solution $X = A^{-1}B$, and the
equation $XA = C$ has the unique solution $X = CA^{-1}$.




Various other important theorems follow easily now that 
the uniqueness of the solution of AX = I for any nonsingular 
matrix A has been established. In particular, we have the follow- 
ing results: 

THEOREM 3 

If A is a nonsingular matrix, then $(A^{-1})^{-1} = A$.

PROOF By Theorem 2, $(A^{-1})^{-1}$ is the unique solution of the equation
$A^{-1}X = I$. However, it is obvious by inspection that $X = A$ satisfies this
equation. Hence $(A^{-1})^{-1} = A$, as asserted.

THEOREM 4

If A and B are nonsingular (n,n) matrices, then $(AB)^{-1} = B^{-1}A^{-1}$.

PROOF By Theorem 2, $(AB)^{-1}$ is the unique solution of the equation
$(AB)X = I$. However, it is clear that $X = B^{-1}A^{-1}$ satisfies this equation, since

$$(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I$$

Hence, $(AB)^{-1} = B^{-1}A^{-1}$, as asserted.
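The reversal rule for the inverse of a product can be checked on a pair of (2,2) matrices, for which the adjoint has a simple closed form. The Python sketch below is an illustration under the assumption of integer entries; `inverse_2x2` and `matmul` are hypothetical helper names, and the matrices are the noncommuting pair from the proof of Theorem 5 in Sec. 10.2.

```python
from fractions import Fraction

def matmul(A, B):
    """Row-by-column product of conformable matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse_2x2(M):
    """Inverse of a nonsingular (2,2) matrix: adjoint divided by determinant."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("a singular matrix has no inverse")
    # The adjoint of ||a b; c d|| is ||d -b; -c a||.
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[1, 2], [3, 4]]
B = [[1, 1], [4, 1]]

lhs = inverse_2x2(matmul(A, B))
print(lhs == matmul(inverse_2x2(B), inverse_2x2(A)))   # True
print(lhs == matmul(inverse_2x2(A), inverse_2x2(B)))   # False, the order matters
```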

COROLLARY 1 

If A, B, . . . , K are nonsingular (n,n) matrices, then

$$(AB \cdots K)^{-1} = K^{-1} \cdots B^{-1}A^{-1}$$

With the inverse of a nonsingular matrix defined, it is now 
possible to define negative integral powers of any nonsingular 
matrix: 

DEFINITION 3 

If A is a nonsingular matrix and if r is a positive integer, then $A^{-r} = (A^{-1})^r$.

Negative powers of singular matrices are not defined.

For nonsingular matrices it is now possible to extend the
familiar laws of exponents to negative as well as nonnegative
integral powers:

THEOREM 5

If A is a nonsingular matrix, then, for all integral values of r and s, $A^r A^s = A^{r+s}$
and $(A^r)^s = A^{rs}$.

If in the corollary of Theorem 2 we take B to be the column 
matrix 


SEC. 10.3 


ADJ01NTS AND INVERSES 


433 


the matric equation AX = B is equivalent to the system of
nonhomogeneous linear equations

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n
\end{aligned}$$

Hence, it is clear from Corollary 1, Theorem 2, that the solution
of this system exists and is uniquely given by

$$X = A^{-1}B$$

if A is nonsingular.

Similarly, if we take B to be the column matrix

$$Y = \begin{Vmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{Vmatrix}$$

the matric equation AX = B becomes the linear transformation

$$Y = AX$$

The corollary of Theorem 2 now assures us that if A is nonsingular
the inverse of this transformation, that is, the transformation
that carries us back from the vector Y to the vector X,
is unique and has the equation

$$X = A^{-1}Y$$

FIGURE 10.1 A typical system of spring-connected masses.

As a physical illustration of the relation of a matrix and its
inverse, let us consider the mass-spring system shown in Fig. 10.1
and determine the forces which act on each of the masses as a
result of arbitrary displacements $x_1$, $x_2$, and $x_3$ of the respective
masses. The modulus of each spring is the indicated value of k;
that is, the force required to stretch each spring a unit distance
is the corresponding value of k. Now, if the masses are displaced
by the respective amounts $x_1$, $x_2$, and $x_3$, the increases in the
length of the various springs are

$k_1$:     $x_1$
$k_{12}$:  $x_2 - x_1$
$k_{13}$:  $x_3 - x_1$
$k_{23}$:  $x_3 - x_2$
$k_3$:     $-x_3$

and the forces represented by these changes in length are

$k_1$:     $k_1x_1$
$k_{12}$:  $k_{12}(x_2 - x_1)$
$k_{13}$:  $k_{13}(x_3 - x_1)$
$k_{23}$:  $k_{23}(x_3 - x_2)$
$k_3$:     $k_3(-x_3)$

a positive force indicating that the spring is stretched and a
negative force indicating that the spring is compressed. Hence,
taking due account of the direction of the force applied to each
mass by each spring attached to it, we find that the forces $f_1$, $f_2$,
and $f_3$ which act on the respective masses are

$$\begin{aligned}
f_1 &= -k_1x_1 + k_{12}(x_2 - x_1) + k_{13}(x_3 - x_1) \\
f_2 &= -k_{12}(x_2 - x_1) + k_{23}(x_3 - x_2) \\
f_3 &= -k_{13}(x_3 - x_1) - k_{23}(x_3 - x_2) - k_3x_3
\end{aligned} \tag{1}$$

or, collecting terms and rewriting in matric notation,

$$F = KX \tag{2}$$

where

$$F = \begin{Vmatrix} f_1 \\ f_2 \\ f_3 \end{Vmatrix} \qquad X = \begin{Vmatrix} x_1 \\ x_2 \\ x_3 \end{Vmatrix}$$

and

$$K = \begin{Vmatrix}
-(k_1 + k_{12} + k_{13}) & k_{12} & k_{13} \\
k_{12} & -(k_{12} + k_{23}) & k_{23} \\
k_{13} & k_{23} & -(k_{13} + k_{23} + k_3)
\end{Vmatrix}$$

Evaluating the first of Eqs. (1) for $x_1 = 1$, $x_2 = 0$, $x_3 = 0$, it is
clear that $-(k_1 + k_{12} + k_{13})$ is the force applied to the first
mass as a result of a unit displacement of that mass. A similar
evaluation shows that, in general, the element in the ith row
and jth column of K is the force applied to the ith mass as a
result of a unit displacement of the jth mass. Because of this
property the matrix K is usually referred to as the stiffness matrix
of the system.

It can easily be verified that, for all positive values of the
k's, the matrix K is nonsingular. Hence, for the physical system
shown in Fig. 10.1, $K^{-1}$ exists, and we can solve Eq. (2) for X,
getting

$$X = K^{-1}F$$

Now, evaluating the right-hand side of this equation for a force
vector with one component 1 and the rest 0, it follows that the
element of $K^{-1}$ in the ith row and jth column is the displacement
produced in the ith mass as a result of a unit force applied to the
jth mass. Because of this property, the matrix $K^{-1}$ is usually




referred to as the elasticity matrix* of the system. Our discussion 
has thus illustrated the important fact that for any elastic system, 
the elasticity matrix is the inverse of the stiffness matrix, and vice 
versa. 
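The relation between the stiffness matrix and the elasticity matrix can be tried out on specific numbers. The sketch below assumes the moduli $k_1, k_{12}, k_{13}, k_{23}, k_3 = 1, 2, 3, 4, 5$, which are illustrative values of ours and not data from the text, and it inverts K by the adjoint formula of this section. The function names are hypothetical.

```python
from fractions import Fraction

def stiffness(k1, k12, k13, k23, k3):
    """Stiffness matrix K of Eq. (2) for the five spring moduli."""
    return [
        [-(k1 + k12 + k13), k12, k13],
        [k12, -(k12 + k23), k23],
        [k13, k23, -(k13 + k23 + k3)],
    ]

def det3(M):
    """Determinant of a (3,3) matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def inverse3(M):
    """Inverse of a (3,3) matrix as adjoint / determinant."""
    d = det3(M)
    if d == 0:
        raise ValueError("matrix is singular")

    def cofactor(r, c):
        rows = [x for x in range(3) if x != r]
        cols = [y for y in range(3) if y != c]
        minor = (M[rows[0]][cols[0]] * M[rows[1]][cols[1]]
                 - M[rows[0]][cols[1]] * M[rows[1]][cols[0]])
        return (-1) ** (r + c) * minor

    # The adjoint is the transpose of the matrix of cofactors.
    return [[Fraction(cofactor(c, r), d) for c in range(3)] for r in range(3)]

def matvec(M, v):
    """Product of a matrix and a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]

K = stiffness(1, 2, 3, 4, 5)   # assumed moduli, for illustration only
E = inverse3(K)                # elasticity matrix K^{-1}

X = matvec(E, [1, 0, 0])       # displacements for F = (1, 0, 0): first column of E
print(X)
print(matvec(K, X))            # applying K to these displacements returns F
```

With these numbers both K and its inverse come out symmetric, in agreement with the reciprocity property mentioned in the footnote.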

In the last section we defined a number of special matrices, 
and now, with the concept of the inverse of a matrix available, 
our list can be extended to include several additional important 
types. In the following table we bring together the types we have 
already defined as well as the new ones we are here introducing: 


TABLE 10.1

$A$
$\bar{A}$ = conjugate of A
$A^T$ = transpose of A
$\bar{A}^T$ = associate of A
$A^{-1}$ = inverse or reciprocal of A (A nonsingular)

Condition on A                                                            Type

$A = \bar{A}$                                                             Real
$A = -\bar{A}$                                                            Imaginary
$A = A^T$                                                                 Symmetric
$A = -A^T$                                                                Skew-symmetric
$A = \bar{A}^T$                                                           Hermitian
$A = -\bar{A}^T$                                                          Skew-hermitian
$A = (A^T)^{-1}$; i.e., $A^{-1} = A^T$ or $AA^T = I$                      Orthogonal
$A = (\bar{A}^T)^{-1}$; i.e., $A^{-1} = \bar{A}^T$ or $A\bar{A}^T = I$    Unitary


Although we cannot go into the details of the matter, it is 
worth noting that orthogonal matrices derive their name from 
the fact that the matrix of a transformation which is a rotation 
of mutually perpendicular, or orthogonal, axes in two or three 
dimensions is always orthogonal. For instance, it is well known 
that in the cartesian plane the equations of a general rotation 
of axes are 

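$$\begin{aligned}
x_1' &= x_1 \cos\alpha + x_2 \sin\alpha \\
x_2' &= -x_1 \sin\alpha + x_2 \cos\alpha
\end{aligned}
\qquad \text{or} \qquad X' = AX$$

where

$$X' = \begin{Vmatrix} x_1' \\ x_2' \end{Vmatrix} \qquad
A = \begin{Vmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{Vmatrix} \qquad
X = \begin{Vmatrix} x_1 \\ x_2 \end{Vmatrix}$$

and it is easy to show that A is orthogonal by verifying that
$AA^T = I$.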
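The verification that $AA^T = I$ for the rotation matrix can also be carried out numerically. The following sketch does so for one arbitrarily chosen angle; the tolerance, the angle, and the helper names are assumptions of ours, and floating-point arithmetic means the identity is checked only to within rounding error.

```python
import math

def rotation(alpha):
    """Matrix A of the rotation of axes through the angle alpha (in radians)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, s], [-s, c]]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def is_identity(M, tol=1e-12):
    """True if M agrees with the identity matrix to within tol in every element."""
    n = len(M)
    return all(abs(M[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

A = rotation(math.radians(37.0))            # the angle is an arbitrary choice
print(is_identity(matmul(A, transpose(A))))  # A A^T = I, so A is orthogonal
print(is_identity(matmul(transpose(A), A)))  # A^T A = I as well
```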


* The symmetry of the stiffness and elasticity matrices, which asserts, for
instance, that the force acting on the ith mass as a result of a unit displacement
of the jth mass is equal to the force acting on the jth mass as a result
of a unit displacement of the ith mass, is an illustration of the famous reciprocity
theorem of Maxwell-Rayleigh-Betti. This theorem is the counterpart
in mechanics of what is known simply as the reciprocity theorem in electrical
circuit analysis.




1 Prove Corollary 1, Theorem 4. 

2 Find the adjoint of each of the following matrices, and, when it exists, find the inverse:


3 a Under what conditions, if any, does AB = AC imply B = C?

b If A is a nonsingular matrix, show that AB = 0 implies B = 0.

4 If A is a nonsingular matrix which commutes with a matrix B, prove that $A^{-1}$ commutes
with B. If B is also nonsingular, do $A^{-1}$ and $B^{-1}$ commute?

5 If D is a nonsingular diagonal matrix, prove that $D^{-1}$ is also a diagonal matrix and that each
element on the principal diagonal of $D^{-1}$ is the reciprocal of the corresponding element in D.
Does a similar result hold for nonsingular triangular matrices? 

6 If A is a singular matrix, prove that the product of A and its adjoint is a null matrix. 

7 Solve the system

$$\begin{aligned}
x_1 - x_2 + 2x_3 &= 1 \\
2x_1 - x_2 &= 2 \\
x_1 + x_2 + x_3 &= 3
\end{aligned}$$

by multiplying both sides of the equivalent matric equation AX = B by the inverse
of the matrix of the coefficients, $A^{-1}$.

8 If A is a nonsingular matrix, show that the determinant of the adjoint of A is equal to the
(n - 1)st power of the determinant of A.

9 If A is a nonsingular matrix, show that the adjoint of the adjoint of A is equal to A times
the (n - 2)nd power of the determinant of A.

10 Prove that the determinant of any orthogonal matrix is either 1 or -1. Is the converse true?

11 Prove that a real matrix is orthogonal if and only if its column vectors are unit vectors which 
are mutually orthogonal. 

12 Prove that, if the column vectors of a real matrix A are mutually orthogonal unit vectors, 
so are the row vectors of A . 

13 Show that, for all positive values of the k’s, the matrix K in Eq. (2) is nonsingular. 

14 Find the stiffness and elasticity matrices for the system shown in Fig. 10.2. 





16 In mechanics it is shown that, if a cantilever beam bears a concentrated load P at a distance
s from the fixed end, then the deflection y at a distance x from the fixed end is given
by the formula*

$$y = \begin{cases}
\dfrac{Px^2(x - 3s)}{6EI} & x \le s \\[2ex]
\dfrac{Ps^2(s - 3x)}{6EI} & x \ge s
\end{cases}$$

where E and I are physical constants of the beam. Using this formula, obtain the stiffness
and elasticity matrices relating the forces and deflections at the positions $s = L/3, 2L/3, L$
and $x = L/3, 2L/3, L$. In what respect, if any, does this problem differ significantly from
the example discussed in the text?

* The theory required for the derivation of this result is summarized in
Sec. 2.6.

10.4 

Rank and the equivalence of matrices 

One of the most important characteristics of a matrix is its rank:

DEFINITION 1

The rank of a matrix A is the largest value of r for which there exists an (r,r) submatrix
of A with nonvanishing determinant.

The rank of a matrix A, as we have just defined it, is sometimes 
referred to more specifically as the determinant rank of A. 
Clearly, as an immediate consequence of Theorem 1, Sec. 10.2, 
we have the following simple but useful result: 


THEOREM 1 

If A and B are two (n,n) matrices of rank n, then both AB and BA are of rank n.


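EXAMPLE 1

The matrix

$$\begin{Vmatrix} 1 & 2 & -1 & 3 \\ 3 & 4 & 0 & -1 \\ -1 & 0 & -2 & 7 \end{Vmatrix}$$

is of rank 2, since each of the third-order submatrices

$$\begin{Vmatrix} 1 & 2 & -1 \\ 3 & 4 & 0 \\ -1 & 0 & -2 \end{Vmatrix} \qquad
\begin{Vmatrix} 1 & 2 & 3 \\ 3 & 4 & -1 \\ -1 & 0 & 7 \end{Vmatrix} \qquad
\begin{Vmatrix} 1 & -1 & 3 \\ 3 & 0 & -1 \\ -1 & -2 & 7 \end{Vmatrix} \qquad
\begin{Vmatrix} 2 & -1 & 3 \\ 4 & 0 & -1 \\ 0 & -2 & 7 \end{Vmatrix}$$

is singular while not all second-order submatrices are singular. Specifically, the determinant of
the (2,2) submatrix in the upper left-hand corner is different from zero.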
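Definition 1 can be applied quite literally by examining the determinants of all square submatrices, beginning with the largest. The sketch below does this for the matrix of Example 1, with the entries as we read them from the text (first column 1, 3, -1). It is meant only to illustrate the definition: the number of submatrices grows rapidly with the size of the matrix, so this is not a practical method for large matrices. The function names are our own.

```python
from itertools import combinations

def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def determinant_rank(A):
    """Largest r for which some (r,r) submatrix of A has nonvanishing determinant."""
    m, n = len(A), len(A[0])
    for r in range(min(m, n), 0, -1):
        for rows in combinations(range(m), r):
            for cols in combinations(range(n), r):
                sub = [[A[i][j] for j in cols] for i in rows]
                if det(sub) != 0:
                    return r
    return 0  # every element is zero

A = [[1, 2, -1, 3],
     [3, 4, 0, -1],
     [-1, 0, -2, 7]]

print(determinant_rank(A))       # rank of the matrix of Example 1
print(det([[1, 2], [3, 4]]))     # the upper left-hand (2,2) minor, which is not zero
```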


In working with matrices it is frequently necessary to 
consider the effect of performing upon them certain simple 
manipulations known as elementary transformations : 

DEFINITION 2 

An elementary transformation of a matrix is any one of the following operations: 

a The multiplication of each element of a row or a column by the same nonzero 
constant 

b The interchange of two rows or of two columns 

c The addition of any multiple of the elements of one row, or one column, to 
the corresponding elements of another row, or column, respectively 

The most important property of elementary transformations 
is contained in the following theorem: 




THEOREM 2 

The rank of a matrix is not altered by any sequence of elementary transformations. 

PROOF Let A be an arbitrary matrix, and let r be its rank. Then every minor 
of A of order greater than r is zero, and at least one minor of order r is different 
from zero. To prove the theorem it is clearly sufficient to prove that no elemen- 
tary transformation can change the rank of A. 

Consider first an elementary transformation of type a. By Theorem 6, Sec.
10.1, such a transformation cannot affect the vanishing or nonvanishing of any
minor of A; hence it cannot alter the rank of A.

A transformation of type b, on the other hand, may affect the vanishing or
nonvanishing of the minor in some particular position in A. However, after a
transformation of type b every submatrix in A exists somewhere in the resulting
matrix, with at most two rows or two columns interchanged. Hence, by Theorem
7, Sec. 10.1, if all minors of A of order greater than r are zero, the same thing will
be true after the transformation is performed; and if at least one rth-order minor
of A is different from zero, the same thing will be true after the transformation.
Thus no transformation of type b can alter the rank of A.

Finally, no transformation of type c can alter the rank of A. For consider
the transformation consisting of modifying the elements of the jth row by adding
to them some multiple $\lambda$ of the corresponding elements in the ith row. (The case
of column modification is handled by an identical argument.) Clearly, some of the
(r + 1)st-order minors of A are unaffected by this transformation. Specifically,
any (r + 1)st-order minor involving neither the ith nor the jth rows, both the
ith and the jth rows, or just the ith row will surely be unaffected, i.e., will be left
equal to zero. On the other hand, the value of an (r + 1)st-order minor involving
the jth row but not the ith may conceivably be affected, since one of its rows
(the jth) is modified by means of a row of elements (the ith) from outside the
minor. However, by the addition theorem for determinants (Theorem 9, Sec. 10.1)
the modified determinant can be written in the form

$$|S_1| + \lambda|S_2|$$

where $S_1$ and $S_2$ are square submatrices of A of order r + 1 and hence singular,
by hypothesis. Thus no vanishing (r + 1)st-order minor of A can be transformed
into one which is different from zero by a transformation of type c; that is, no
transformation of type c can increase the rank of A. On the other hand, no
transformation of type c can decrease the rank of A, either. For if this were the case,
then the inverse transformation, that is, the transformation consisting of adding
$-\lambda$ times the elements of the ith row to the corresponding elements of the jth
row in the new matrix, would be a transformation of type c which restored the
matrix to its original form and hence increased its rank to the original value r;
and this we have just proved to be impossible. Thus the rank of A, being neither
increased nor decreased by a transformation of type c, is invariant under this
transformation also, and our proof is complete.

DEFINITION 3 

Two matrices A and B, one of which (and hence either of which) can be obtained 
from the other by a series of elementary transformations, are said to be equivalent. 




It is interesting and important to note that any elementary 
transformation involving the rows of a matrix A can be accom- 
plished by premultiplying A by a unit matrix on whose rows 
the same elementary transformation has been performed, and 
any elementary transformation involving the columns of A can 
be accomplished by postmultiplying A by a unit matrix on whose 
columns the same elementary transformation has been performed. 
More specifically, we have the following theorems, whose proofs 
follow immediately from the definition of matric multiplication : 

THEOREM 3 

If A is an arbitrary (m,n) matrix and if M (N) is the matrix obtained from the
identity matrix $I_m$ ($I_n$) by multiplying the elements in the ith row (column) by $\lambda$,
then the product MA (AN) is identical with A except for the ith row (column)
which consists of the elements of the ith row (column) of A each multiplied by $\lambda$.

THEOREM 4

If A is an arbitrary (m,n) matrix and if M (N) is the matrix obtained from the
identity matrix $I_m$ ($I_n$) by interchanging its ith and jth rows (columns), then the
product MA (AN) is identical with A except for the ith and jth rows (columns)
which are interchanged.

THEOREM 5

If A is an arbitrary (m,n) matrix and if M (N) is the matrix obtained from $I_m$ ($I_n$)
by adding to the elements of the jth row (column) $\lambda$ times the corresponding
elements in the ith row (column), then the product MA (AN) is identical with A
except for the jth row (column) which consists of the elements of the jth row
(column) of A plus $\lambda$ times the corresponding elements in the ith row (column)
of A.
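Theorems 3, 4, and 5 can be illustrated by actually forming the modified identity matrices and carrying out the multiplications. In the sketch below the matrix A and the multipliers are arbitrary choices of ours, the function names are hypothetical, and rows and columns are numbered from 0, as is customary in Python, rather than from 1 as in the text.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def scale(n, i, lam):
    """Identity of order n with row (equally, column) i multiplied by lam."""
    M = identity(n)
    M[i][i] = lam
    return M

def interchange(n, i, j):
    """Identity of order n with rows (equally, columns) i and j interchanged."""
    M = identity(n)
    M[i], M[j] = M[j], M[i]
    return M

def add_row_multiple(n, i, j, lam):
    """Identity of order n with lam times row i added to row j (i != j)."""
    M = identity(n)
    M[j][i] = lam
    return M

def add_col_multiple(n, i, j, lam):
    """Identity of order n with lam times column i added to column j (i != j)."""
    N = identity(n)
    N[i][j] = lam
    return N

A = [[1, 2, 3],
     [4, 5, 6]]   # a (2,3) matrix, so M is of order 2 and N is of order 3

print(matmul(scale(2, 1, -1), A))                # Theorem 3: row 1 multiplied by -1
print(matmul(interchange(2, 0, 1), A))           # Theorem 4: rows 0 and 1 interchanged
print(matmul(add_row_multiple(2, 0, 1, 10), A))  # Theorem 5: row 1 plus 10 times row 0
print(matmul(A, add_col_multiple(3, 0, 2, 10)))  # Theorem 5: column 2 plus 10 times column 0
```

Note that row transformations use premultiplication by a matrix of order m, while column transformations use postmultiplication by a matrix of order n.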

From the preceding theorems it is clear that a sequence
of elementary transformations $T_1, T_2, \ldots, T_k$ on the rows
(columns) of an (m,n) matrix A can be accomplished by premultiplying
(postmultiplying) A by a sequence of matrices $M_1, M_2, \ldots, M_k$
($N_1, N_2, \ldots, N_k$) each obtained from the identity
matrix $I_m$ ($I_n$) by performing upon its rows (columns) the
same elementary transformations. The product $M_k \cdots M_2M_1$
($N_1N_2 \cdots N_k$) of the matrices by which A is premultiplied
(postmultiplied) can, of course, be expressed as a single matrix
P (Q), necessarily of rank m (n) since it is the product of matrices
each obtained from $I_m$ ($I_n$) by elementary transformations
and each therefore of rank m (n). We have thus established the
following important theorem:

THEOREM 6 

If A and B are equivalent matrices, then B = PAQ, where P and Q are non- 
singular matrices. 

In view of the way the nonsingular matrices P and Q of the
last theorem were obtained, it is interesting to inquire whether,
conversely, any nonsingular matrix can be obtained from the
corresponding identity matrix by elementary row transformations
or elementary column transformations. The answer is Yes, as
the following pair of theorems makes clear:

THEOREM 7 

Any nonsingular (n,n) matrix can be reduced to the identity matrix I n either by 
elementary row transformations or by elementary column transformations. 

PROOF Let $P = \|p_{ij}\|$ be an arbitrary nonsingular (n,n) matrix. Because P
is nonsingular, at least one element in the first column must be different from
zero; and it is no specialization to assume that $p_{11} \ne 0$, for if this is not the case
the interchange of two rows will bring a nonzero element into the leading position.
Since $p_{11} \ne 0$, the leading element in the first column may be reduced to 1 by
multiplying each of the elements in the first row by $1/p_{11}$. Then by subtracting
the appropriate multiple of the first row from each of the other rows we obtain
the matrix

$$\begin{Vmatrix}
1 & r_{12} & r_{13} & \cdots & r_{1n} \\
0 & r_{22} & r_{23} & \cdots & r_{2n} \\
\vdots & \vdots & \vdots & & \vdots \\
0 & r_{n2} & r_{n3} & \cdots & r_{nn}
\end{Vmatrix}$$

Since the original matrix, and therefore the last one, is nonsingular, it follows
that the submatrix

$$\begin{Vmatrix}
r_{22} & \cdots & r_{2n} \\
\vdots & & \vdots \\
r_{n2} & \cdots & r_{nn}
\end{Vmatrix}$$

is nonsingular. Hence the same reduction can be applied to it, and thus, continuing
the process sufficiently, we obtain an upper triangular matrix

$$\begin{Vmatrix}
1 & s_{12} & s_{13} & \cdots & s_{1n} \\
0 & 1 & s_{23} & \cdots & s_{2n} \\
0 & 0 & 1 & \cdots & s_{3n} \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{Vmatrix}$$

Finally, working upward by row operations similar to those we have just employed,
the elements above the diagonal can all be reduced to zero. This proves
the assertion of the theorem for reductions involving only elementary row
transformations. A similar argument shows that P can also be reduced to the identity
by means of elementary column transformations. Thus the theorem is established.

THEOREM 8 

Any nonsingular (n,n) matrix can be obtained from the identity matrix $I_n$ by a
sequence of elementary row transformations or a sequence of elementary column
transformations.

PROOF If R is any nonsingular (n,n) matrix, we know from the last theorem
that a sequence of elementary row transformations can be found which will
reduce R to $I_n$. Then, by Theorems 3, 4, and 5, we know that there exist
corresponding matrices $M_1, M_2, \ldots, M_k$ such that

$$I_n = M_k \cdots M_2M_1R$$

Postmultiplying this equation by $R^{-1}$, which of course exists since R is
nonsingular, we have

$$R^{-1} = M_k \cdots M_2M_1I_n \tag{1}$$

In other words, if we multiply $I_n$ by the matrices corresponding to the elementary
row operations by which we reduce R to $I_n$, the result is $R^{-1}$. Thus, if we have a
nonsingular matrix P, which we wish to obtain from the corresponding identity
matrix by a sequence of elementary row transformations, we need only take the
matrix R in Eq. (1) to be $P^{-1}$, since $(P^{-1})^{-1} = P$; that is, to obtain P we need
only apply to I the successive row operations by which $P^{-1}$ is converted into I.
A similar argument shows that an arbitrary nonsingular matrix P can also be
obtained from the corresponding identity matrix by a sequence of elementary
column transformations.

Incidentally, Eq. (1) provides a method for determining
the inverse of a matrix R which is of some practical value, since
the matrices $M_1, M_2, \ldots, M_k$ can easily be found from the
straightforward process of reducing R to the identity matrix.


EXAMPLE 2 

Find a sequence of elementary row transformations which will reduce the matrix

$$P = \begin{Vmatrix} 1 & 2 & 0 \\ 2 & 3 & -1 \\ -1 & -1 & 2 \end{Vmatrix}$$

to $I_3$; determine the matrices $M_1, M_2, \ldots$ corresponding to these transformations; and use
these results to compute the inverse of P.

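By inspection it is clear that P can be reduced to $I_3$ by the following sequence of row
operations:

$T_1$: Row 2 - 2 · Row 1

$$M_1 = \begin{Vmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{Vmatrix} \qquad
M_1P = \begin{Vmatrix} 1 & 2 & 0 \\ 0 & -1 & -1 \\ -1 & -1 & 2 \end{Vmatrix}$$

$T_2$: Row 3 + Row 1

$$M_2 = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{Vmatrix} \qquad
M_2M_1P = \begin{Vmatrix} 1 & 2 & 0 \\ 0 & -1 & -1 \\ 0 & 1 & 2 \end{Vmatrix}$$

$T_3$: - Row 2

$$M_3 = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{Vmatrix} \qquad
M_3M_2M_1P = \begin{Vmatrix} 1 & 2 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & 2 \end{Vmatrix}$$

$T_4$: Row 3 - Row 2

$$M_4 = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{Vmatrix} \qquad
M_4M_3M_2M_1P = \begin{Vmatrix} 1 & 2 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{Vmatrix}$$

$T_5$: Row 2 - Row 3

$$M_5 = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{Vmatrix} \qquad
M_5M_4M_3M_2M_1P = \begin{Vmatrix} 1 & 2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{Vmatrix}$$

$T_6$: Row 1 - 2 · Row 2

$$M_6 = \begin{Vmatrix} 1 & -2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{Vmatrix} \qquad
M_6M_5M_4M_3M_2M_1P = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{Vmatrix}$$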




The inverse $P^{-1}$ of the given matrix can now be found either by using Eq. (1), namely,

$$P^{-1} = M_6M_5M_4M_3M_2M_1I_3$$

or simply by performing on $I_3$ the same sequence of row transformations used to reduce P to $I_3$:

$$P^{-1} = T_6T_5T_4T_3T_2T_1I_3$$

The result, by either method, is

$$P^{-1} = \begin{Vmatrix} -5 & 4 & 2 \\ 3 & -2 & -1 \\ -1 & 1 & 1 \end{Vmatrix}$$
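The method of Eq. (1) lends itself readily to mechanical computation. The following sketch reduces a matrix to the identity by elementary row transformations and applies each transformation to the identity matrix at the same time. It is one possible arrangement, not the sequence $T_1, \ldots, T_6$ chosen by inspection in the example: here each column is cleared above and below the pivot in a single pass. The function name is our own, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def inverse_by_row_reduction(R):
    """Reduce R to I by elementary row transformations, applying each one to I as well."""
    n = len(R)
    work = [[Fraction(x) for x in row] for row in R]
    inv = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        # Type b: bring a nonzero element into the pivot position if necessary.
        pivot = next((r for r in range(col, n) if work[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        work[col], work[pivot] = work[pivot], work[col]
        inv[col], inv[pivot] = inv[pivot], inv[col]
        # Type a: multiply the pivot row by the reciprocal of the pivot.
        p = work[col][col]
        work[col] = [x / p for x in work[col]]
        inv[col] = [x / p for x in inv[col]]
        # Type c: subtract multiples of the pivot row from every other row.
        for r in range(n):
            if r != col and work[r][col] != 0:
                f = work[r][col]
                work[r] = [x - f * y for x, y in zip(work[r], work[col])]
                inv[r] = [x - f * y for x, y in zip(inv[r], inv[col])]
    return inv

P = [[1, 2, 0],
     [2, 3, -1],
     [-1, -1, 2]]

for row in inverse_by_row_reduction(P):
    print([int(x) for x in row])   # the elements happen to be integers for this P
```

Since the inverse of a nonsingular matrix is unique, any valid sequence of row transformations must lead to the same matrix as the one found in the example.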

With Theorem 8, we can now prove the converse of Theo- 
rem 6: 

THEOREM 9 

If B = PAQ, where P and Q are nonsingular matrices, then A and B are
equivalent.

PROOF By Theorem 8, P can be obtained from the corresponding identity
matrix by a sequence of elementary row transformations, and Q can be obtained
from the corresponding identity matrix by a sequence of elementary column
transformations. Thus, as in the proof of Theorem 8, there is a set of matrices
$M_1, M_2, \ldots, M_k$, each representing some elementary row operation, such that

$$P = M_k \cdots M_2M_1$$

and a set of matrices $N_1, N_2, \ldots, N_l$, each representing some elementary
column operation, such that

$$Q = N_1N_2 \cdots N_l$$

Hence,

$$B = PAQ = (M_k \cdots M_2M_1)A(N_1N_2 \cdots N_l)$$

which proves that B is obtained from A by elementary row and column
transformations and, hence, is equivalent to A, as asserted.

By a proof almost identical with the proof of Theorem 7, 
we can prove the following theorem: 

THEOREM 10 

Any (m,n) matrix of rank r can be reduced by elementary transformations, which
in general will involve both rows and columns, to an (m,n) matrix in which $a_{ii} = 1$
(i = 1, 2, . . . , r) and all other elements are zero.

From Theorem 2 it is clear that equivalent matrices have 
the same rank. In view of Theorem 10, it is clear that, given 
two (m,n) matrices of the same rank, each can be reduced by 
elementary transformations to the same standard form, and, 
hence, each can be reduced to the other, via the standard form, 
by elementary transformations. Thus we have established the 
following important theorem: 




THEOREM 11

Two (m,n) matrices are equivalent if and only if they have the same rank. 
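Since elementary transformations do not alter rank (Theorem 2), the rank of a matrix can be found by reducing it with row transformations and counting the pivots, and the criterion just stated then gives a simple test for equivalence. The sketch below is one way of arranging this computation; the function names and the small test matrices are assumptions of ours, and the fact that the number of pivots in the reduced matrix equals its rank is taken for granted here.

```python
from fractions import Fraction

def rank(A):
    """Rank found by elementary row transformations, which preserve rank."""
    rows = [[Fraction(x) for x in row] for row in A]
    r = 0  # number of pivots found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue                                  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]   # type b
        for i in range(r + 1, len(rows)):             # type c
            f = rows[i][col] / rows[r][col]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def are_equivalent(A, B):
    """Two (m,n) matrices are equivalent if and only if they have the same rank."""
    same_shape = len(A) == len(B) and len(A[0]) == len(B[0])
    return same_shape and rank(A) == rank(B)

A = [[1, 2], [2, 4]]   # rank 1
B = [[0, 0], [0, 3]]   # rank 1
C = [[1, 0], [0, 1]]   # rank 2

print(are_equivalent(A, B))
print(are_equivalent(A, C))
```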

The equivalence relation 
B = PAQ P,Q nonsingular 

is a very general one, and many applications involve special 
cases in which P and Q satisfy additional conditions. These can 
all be thought of as transformations of a matrix A into a matrix B, 
and the usual terminology reflects this point of view. The follow- 
ing table summarizes the various cases of particular interest: 


TABLE 10.2

If P, Q are arbitrary         $B = PAQ$                            is an equivalence transformation
nonsingular matrices                                               and B is equivalent to A.

If $P = Q^{-1}$               $B = Q^{-1}AQ$                       is a similarity transformation
                                                                   and B is similar to A.

If $P = Q^T$                  $B = Q^TAQ$                          is a congruence transformation
                                                                   and B is congruent to A.

If $P = Q^T = Q^{-1}$         $B = Q^TAQ = Q^{-1}AQ$               is an orthogonal transformation
                                                                   and B is orthogonally similar to A.

If $P = \bar{Q}^T = Q^{-1}$   $B = \bar{Q}^TAQ = Q^{-1}AQ$         is a unitary transformation
                                                                   and B is unitarily similar to A.


EXERCISES 

1 If a matrix is of rank r, is it possible that for some value p, less than r, all minors of order p 
are equal to zero? Why? 

2 Determine the rank of each of the following matrices: 

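b $\begin{Vmatrix} 2 & -1 & 3 \\ 1 & -2 & 3 \\ 5 & 0 & 3 \end{Vmatrix}$

d $\begin{Vmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 5 \\ 3 & 4 & 5 & 6 \\ 4 & 5 & 6 & 7 \\ 5 & 6 & 7 & 8 \end{Vmatrix}$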

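3 Determine the rank of each of the following matrices as a function of $\lambda$:

a $\begin{Vmatrix} 8(1 - \lambda) & -2 & 0 \\ -2 & 3 - 2\lambda & -1 \\ 0 & -1 & 2(1 - \lambda) \end{Vmatrix}$

b $\begin{Vmatrix} 1 - \lambda & 1 & 1 \\ 1 & 3 - \lambda & 3 \\ 2 & 1 & 4 - \lambda \end{Vmatrix}$

c $\begin{Vmatrix} 5 - \lambda & 4 & -2 \\ 4 & 5 - \lambda & -2 \\ -2 & -2 & 3 - 2\lambda \end{Vmatrix}$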





4 If A is an (m,1) matrix and B is a (1,n) matrix, show that the rank of the matrix AB is 1.

5 Prove Theorem 10.

6 Prove that the relation of equivalence has the following properties:
a Every matrix is equivalent to itself.
b If A is equivalent to B, then B is equivalent to A.
c If A is equivalent to B and B is equivalent to C, then A is equivalent to C.

Do the other relations listed in Table 10.2 have these properties?

7 a Work Example 2 using only elementary column transformations.


b Work Example 2 if P is the matrix 

8 Show that A - | ® ® Jj and B « 

matrices P and Q, such that B — PAQ, 

II 0 1 


Show that A = 


1 12 


are equivalent, and find nonsingular 


110 

0 1 1 are equivalent, and find nonsingular 

2 1 1 


matrices P and Q such that B ~ PAQ. 



1 2 1 0 

| 2 1 

0 

10 Show that A = 

1 2 1 

and B - 1 1 

2 


14 -1 -2 

0 -1 

-4 


singular matrices P and Q such that B - PAQ. 


equivalent, and find non- 


10.5 

Systems of linear equations 

Determinants and matrices find their most important application 
in the study of linear dependence and independence and in the 
closely related problem of the solution of systems of simultaneous 
linear equations: 

DEFINITION 1 

The quantities $Q_1, Q_2, \ldots, Q_n$ are said to be linearly dependent if there exists a
set of constants $c_1, c_2, \ldots, c_n$, at least one of which is different from zero, such
that the equation

$$c_1Q_1 + c_2Q_2 + \cdots + c_nQ_n = 0$$

holds identically.

DEFINITION 2

The quantities $Q_1, Q_2, \ldots, Q_n$ are said to be linearly independent if they are
not linearly dependent; i.e., if the only linear equation of the form

$$c_1Q_1 + c_2Q_2 + \cdots + c_nQ_n = 0$$

which they satisfy identically has

$$c_1 = c_2 = \cdots = c_n = 0$$




THEOREM 1 

If the quantities $Q_1, Q_2, \ldots, Q_n$ are linearly dependent, then at least one
(though not necessarily each one) of the quantities can be expressed as a linear
combination of the remaining ones.

PROOF Since $Q_1, Q_2, \ldots, Q_n$ are linearly dependent, they necessarily satisfy
a linear equation of the form $c_1Q_1 + c_2Q_2 + \cdots + c_nQ_n = 0$ in which at least
one of the c's, say $c_i$, is different from zero. This being the case, we can divide
by $c_i$, getting

$$Q_i = -\sum_{j \ne i} \frac{c_j}{c_i} Q_j$$

which expresses $Q_i$ as a linear combination of the remaining Q's, as asserted.
Since some, though not all, of the c's may be zero, it follows that we may not be
able to solve for each of the Q's in this fashion.

EXAMPLE 1

Show that the quantities 1, x, and $x^2$ are linearly independent.

If 1, x, and $x^2$ are not linearly independent, they must satisfy identically some linear
equation of the form

$$c_1(1) + c_2(x) + c_3(x^2) = 0$$

in which at least one of the c's is different from zero. However, evaluating this identity for the
particular values x = -1, 0, 1, we obtain the three equations:

$$\begin{aligned}
c_1 - c_2 + c_3 &= 0 \\
c_1 &= 0 \\
c_1 + c_2 + c_3 &= 0
\end{aligned}$$

and, by inspection, the only solution of this system is $c_1 = c_2 = c_3 = 0$. Since this contradicts
the assumption of linear dependence, the given quantities must be linearly independent, as
asserted.


EXAMPLE 2

Show that the vectors

$$V_1 = \begin{Vmatrix} 1 \\ 2 \\ 3 \end{Vmatrix} \qquad
V_2 = \begin{Vmatrix} 2 \\ -1 \\ 3 \end{Vmatrix} \qquad
V_3 = \begin{Vmatrix} 0 \\ 1 \\ -1 \end{Vmatrix} \qquad
V_4 = \begin{Vmatrix} 4 \\ -1 \\ 5 \end{Vmatrix}$$

are linearly dependent.

These vectors will be linearly dependent if and only if constants $c_1$, $c_2$, $c_3$, and $c_4$ exist such
that

a At least one of them is different from zero.

b $$c_1 \begin{Vmatrix} 1 \\ 2 \\ 3 \end{Vmatrix} + c_2 \begin{Vmatrix} 2 \\ -1 \\ 3 \end{Vmatrix}
+ c_3 \begin{Vmatrix} 0 \\ 1 \\ -1 \end{Vmatrix} + c_4 \begin{Vmatrix} 4 \\ -1 \\ 5 \end{Vmatrix}
= \begin{Vmatrix} 0 \\ 0 \\ 0 \end{Vmatrix}$$

Condition b is, of course, equivalent to the three scalar equations:

$$\begin{aligned}
c_1 + 2c_2 + 4c_4 &= 0 \\
2c_1 - c_2 + c_3 - c_4 &= 0 \\
3c_1 + 3c_2 - c_3 + 5c_4 &= 0
\end{aligned}$$

and it is not difficult to verify that these are satisfied by the values

$$c_1 = 0 \qquad c_2 = 2\lambda \qquad c_3 = \lambda \qquad c_4 = -\lambda \qquad \lambda \text{ arbitrary}$$

and by no others. Hence the four vectors are linearly dependent and, in fact, are connected by
the relation

$$0V_1 + 2V_2 + V_3 - V_4 = 0$$

and (except for constant multiples of this) no others. From this it is obvious that $V_2$, $V_3$, and $V_4$
can each be expressed in terms of the remaining vectors of the set, but that $V_1$ cannot be so
expressed.*

On the other hand, the vectors $V_1$, $V_2$, and $V_3$ are linearly independent, since an equation of
the form $c_1V_1 + c_2V_2 + c_3V_3 = 0$, that is,

$$c_1 \begin{Vmatrix} 1 \\ 2 \\ 3 \end{Vmatrix} + c_2 \begin{Vmatrix} 2 \\ -1 \\ 3 \end{Vmatrix}
+ c_3 \begin{Vmatrix} 0 \\ 1 \\ -1 \end{Vmatrix} = \begin{Vmatrix} 0 \\ 0 \\ 0 \end{Vmatrix}$$

implies that

$$\begin{aligned}
c_1 + 2c_2 &= 0 \\
2c_1 - c_2 + c_3 &= 0 \\
3c_1 + 3c_2 - c_3 &= 0
\end{aligned}$$

and by direct solution we find that the only values which satisfy this system of equations are
$c_1 = c_2 = c_3 = 0$. Similarly we can verify that $V_1$, $V_2$, and $V_4$ are independent and that $V_1$, $V_3$,
and $V_4$ are independent. However $V_2$, $V_3$, and $V_4$ are dependent, since, as we observed above, they
satisfy the relation $2V_2 + V_3 - V_4 = 0$.
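The conclusions of this example are easily confirmed by direct computation. The sketch below evaluates the relation $0V_1 + 2V_2 + V_3 - V_4$ and then tests each set of three vectors by means of a third-order determinant. The determinant criterion (three vectors with three components are linearly independent exactly when the determinant formed from them is different from zero) is a standard fact which we assume here rather than one established at this point in the text, and the function names are our own.

```python
def det3(M):
    """Determinant of a (3,3) array by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def combination(coeffs, vectors):
    """The linear combination c1*V1 + c2*V2 + ... computed component by component."""
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors))
                 for k in range(len(vectors[0])))

def independent(a, b, c):
    """Assumed criterion: independent exactly when the determinant is not zero."""
    return det3([a, b, c]) != 0

V1, V2, V3, V4 = (1, 2, 3), (2, -1, 3), (0, 1, -1), (4, -1, 5)

print(combination((0, 2, 1, -1), (V1, V2, V3, V4)))  # the relation among the four vectors
print(independent(V1, V2, V3))
print(independent(V1, V2, V4))
print(independent(V1, V3, V4))
print(independent(V2, V3, V4))
```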

As a simple application of the notion of linear independence, 
we have the following useful result: 


THEOREM 2 

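If $V_1, V_2, \ldots, V_m$ are m vectors each having $n \ge m$ components and if for
i = 1, 2, . . . , m the first (the last) nonzero component of $V_i$ is the ith [the
(n - m + i)th], then $V_1, V_2, \ldots, V_m$ are linearly independent.

PROOF By hypothesis, the given vectors are of the form

$$V_1 = \begin{Vmatrix} v_{11} \\ v_{21} \\ v_{31} \\ \vdots \\ v_{m1} \\ \vdots \\ v_{n1} \end{Vmatrix} \qquad
V_2 = \begin{Vmatrix} 0 \\ v_{22} \\ v_{32} \\ \vdots \\ v_{m2} \\ \vdots \\ v_{n2} \end{Vmatrix} \qquad
V_3 = \begin{Vmatrix} 0 \\ 0 \\ v_{33} \\ \vdots \\ v_{m3} \\ \vdots \\ v_{n3} \end{Vmatrix} \qquad \cdots \qquad
V_m = \begin{Vmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ v_{mm} \\ \vdots \\ v_{nm} \end{Vmatrix}$$

where $v_{11}, v_{22}, v_{33}, \ldots, v_{mm}$ are all different from zero. Now, since each vector
has n components, the condition $c_1V_1 + c_2V_2 + \cdots + c_mV_m = 0$ implies n
scalar equations, the first m of which are

$$\begin{aligned}
c_1v_{11} &= 0 \\
c_1v_{21} + c_2v_{22} &= 0 \\
c_1v_{31} + c_2v_{32} + c_3v_{33} &= 0 \\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\
c_1v_{m1} + c_2v_{m2} + c_3v_{m3} + \cdots + c_mv_{mm} &= 0
\end{aligned}$$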


* Of course it is possible for a set of dependent quantities $Q_1, Q_2, \ldots, Q_n$
to satisfy more than one independent linear equation. In problems where this
occurs, it may well be that some of the equations can be solved for $Q_i$, say,
while the others cannot. Naturally, if even one equation can be solved for $Q_i$,
then $Q_i$ can be expressed in terms of the other members of the set.




Hence, since $v_{11}, v_{22}, v_{33}, \ldots, v_{mm}$ are all different from zero, it follows that
$c_1 = c_2 = c_3 = \cdots = c_m = 0$. Therefore the vectors $V_1, V_2, V_3, \ldots, V_m$ are
linearly independent, as asserted. A similar argument establishes the parenthetical
assertion of the theorem.

From Examples 1 and 2, it is clear that questions concerning 
linear dependence and independence are closely related to the 
solution of systems of simultaneous linear equations, and to 
these we now turn our attention. In the most general case we 
have a system of the form 

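$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned} \tag{1}$$

where m, the number of equations, is not necessarily equal to n,
the number of unknowns. If at least one of the m quantities $b_i$ is
different from zero, the system is said to be nonhomogeneous.
If $b_i = 0$ for all values of i, the system is said to be homogeneous.

If we define the matrices

$$A = \begin{Vmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{Vmatrix} \qquad
X = \begin{Vmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{Vmatrix} \qquad
B = \begin{Vmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{Vmatrix}$$

the system can be written in the compact matric form

$$AX = B \tag{2}$$

In this form, the matrix A is known as the coefficient matrix of
the system and the matrix

$$\| A \;\; B \|$$

obtained by adjoining the column matrix B to the coefficient
matrix A is known as the augmented matrix of the system.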

Before proceeding to the question of the existence and deter- 
mination of solutions of (2), we shall first prove several important 
theorems about such solutions on the assumption that they exist:* 


THEOREM 3 

If $X_1$ and $X_2$ are two solution vectors of the homogeneous matric equation
AX = 0, then, for all values of the scalar constants $c_1$ and $c_2$, the vector
$c_1X_1 + c_2X_2$ is also a solution of AX = 0.

PROOF By direct substitution we have

$$\begin{aligned}
A(c_1X_1 + c_2X_2) &= A(c_1X_1) + A(c_2X_2) \\
&= c_1(AX_1) + c_2(AX_2) \\
&= c_1 \cdot 0 + c_2 \cdot 0 \\
&= 0
\end{aligned}$$

* It is interesting to note the striking resemblance between the next three 
theorems and Theorems 1, 2, 3 of Sec. 2.1 and Theorems 1, 2, 3 of Sec. 4.5. 




where the coefficients of $c_1$ and $c_2$ vanish because, by hypothesis, both $X_1$ and $X_2$
are solutions of AX = 0. Hence, $c_1X_1 + c_2X_2$ also satisfies AX = 0, as asserted.

THEOREM 4 

If k is the maximum number of linearly independent solution vectors of the 
system $AX = 0$ and if $X_1, X_2, \ldots, X_k$ are k particular linearly independent 
solution vectors, then any solution vector of $AX = 0$ can be expressed in the 
form 

\[
c_1X_1 + c_2X_2 + \cdots + c_kX_k
\]
where the c's are scalar constants. 

PROOF Let k be the maximum number of linearly independent solution 
vectors of the equation $AX = 0$; let $X_1, X_2, \ldots, X_k$ be a particular set of k 
linearly independent solution vectors; and let $X_{k+1}$ be any solution vector. If 
$X_{k+1}$ is one of the vectors in the set $\{X_1, X_2, \ldots, X_k\}$, the assertion of the 
theorem is obviously true. If $X_{k+1}$ is not a member of the set $\{X_1, X_2, \ldots, X_k\}$, then 
$X_1, X_2, \ldots, X_k, X_{k+1}$ cannot be linearly independent, since, by hypothesis, k is 
the maximum number of linearly independent solution vectors of $AX = 0$. 
Hence, the X's must satisfy a linear equation of the form 

\[
c_1X_1 + c_2X_2 + \cdots + c_kX_k + c_{k+1}X_{k+1} = 0 \tag{3}
\]

in which at least one c is different from zero. In fact, $c_{k+1} \neq 0$, for otherwise 
Eq. (3) would reduce to 

\[
c_1X_1 + c_2X_2 + \cdots + c_kX_k = 0
\]

with at least one of the c's different from zero, contrary to the hypothesis that 
$X_1, X_2, \ldots, X_k$ are linearly independent. But if $c_{k+1} \neq 0$, it is clearly possible 
to solve Eq. (3) for $X_{k+1}$ and express it in the form asserted by the theorem. 

Because of the property guaranteed by the last theorem, a 
general linear combination of the maximum number of linearly 
independent solution vectors of AX = 0 is usually referred to 
as a complete solution of AX = 0. 

THEOREM 5 

If $X_p$ is a particular solution vector of the nonhomogeneous system $AX = B$ 
and if $c_1X_1 + c_2X_2 + \cdots + c_kX_k$ is a complete solution of the related 
homogeneous system $AX = 0$, then any solution of the nonhomogeneous system can 
be written in the form 

\[
c_1X_1 + c_2X_2 + \cdots + c_kX_k + X_p
\]

PROOF Let $X_p$ be a particular solution vector of the nonhomogeneous 
equation $AX = B$ and let $X_a$ be any solution vector of this equation. Then $AX_p = B$ 
and $AX_a = B$, and, subtracting these two equations, we have 

\[
AX_a - AX_p = 0 \qquad\text{or}\qquad A(X_a - X_p) = 0
\]

Now the last equation shows that $X_a - X_p$ is a solution of the homogeneous 
equation $AX = 0$. Hence, by Theorem 4, it can be expressed in the form 

\[
X_a - X_p = c_1X_1 + c_2X_2 + \cdots + c_kX_k
\]

Therefore, transposing, 

\[
X_a = c_1X_1 + c_2X_2 + \cdots + c_kX_k + X_p
\]

Since $X_a$ was any solution vector of the equation $AX = B$, the theorem is 
established. 

We now turn our attention to the question of when solutions 
of the equation AX = B will actually exist. The central result 
is contained in the following theorem: 

THEOREM 6 

A system of m simultaneous linear equations in n unknowns, $AX = B$, has a 
solution if and only if the coefficient matrix A and the augmented matrix $\|AB\|$ 
have the same rank r. When solutions exist, the maximum number of independent 
arbitrary constants in any general solution, i.e., the maximum number of linearly 
independent solution vectors of the related homogeneous system $AX = 0$, is $n - r$. 

PROOF We shall prove this theorem by applying to the given system 

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
\]

a procedure, known as the Gauss reduction, which resembles closely the method 
by which we proved Theorem 7, Sec. 10.4. We begin by assuming that $a_{11} \neq 0$, 
which is no specialization, since at least one of the coefficients in the first equation 
must be different from zero, and, by renaming the unknowns, if necessary, it can 
be brought into the leading position. Next we divide the first equation by $a_{11}$ 
and then multiply it in turn by $a_{21}, a_{31}, \ldots, a_{m1}$ and subtract it from the second, 
third, . . . , mth equation. This gives the equivalent* system 


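\[
\begin{aligned}
x_1 + \alpha_{12}x_2 + \alpha_{13}x_3 + \cdots + \alpha_{1n}x_n &= \beta_1 \\
a'_{22}x_2 + a'_{23}x_3 + \cdots + a'_{2n}x_n &= b'_2 \\
a'_{32}x_2 + a'_{33}x_3 + \cdots + a'_{3n}x_n &= b'_3 \\
&\;\;\vdots \\
a'_{m2}x_2 + a'_{m3}x_3 + \cdots + a'_{mn}x_n &= b'_m
\end{aligned}
\]

where, explicitly, 

\[
\alpha_{1j} = \frac{a_{1j}}{a_{11}} \qquad
\beta_1 = \frac{b_1}{a_{11}} \qquad
a'_{ij} = a_{ij} - a_{i1}\frac{a_{1j}}{a_{11}} \qquad
b'_i = b_i - a_{i1}\frac{b_1}{a_{11}}
\]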


Now we apply the same process to the last m − 1 equations, noting that, if 
$a'_{22} = 0$, a renaming of the last n − 1 unknowns with possibly a rearrangement 
of the last m − 1 equations will introduce a nonzero coefficient in place of $a'_{22}$ 
unless all coefficients in the remaining equations are zero, which, of course, may 


* Two equations or systems of equations are said to be equivalent if every 
solution of one is a solution of the other, and conversely. 




be the case at some stage in the process. The result of this second reduction is 
the system 

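\[
\begin{aligned}
x_1 + \alpha_{12}x_2 + \alpha_{13}x_3 + \cdots + \alpha_{1n}x_n &= \beta_1 \\
x_2 + \alpha_{23}x_3 + \cdots + \alpha_{2n}x_n &= \beta_2 \\
a''_{33}x_3 + \cdots + a''_{3n}x_n &= b''_3 \\
a''_{43}x_3 + \cdots + a''_{4n}x_n &= b''_4 \\
&\;\;\vdots \\
a''_{m3}x_3 + \cdots + a''_{mn}x_n &= b''_m
\end{aligned}
\]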

We now continue in exactly the same fashion until the process terminates. If 
m < n, this may happen because after m applications there are no more equations 
to which to apply it: 

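\[
\begin{aligned}
x_1 + \alpha_{12}x_2 + \alpha_{13}x_3 + \cdots + \alpha_{1m}x_m + \alpha_{1,m+1}x_{m+1} + \cdots + \alpha_{1n}x_n &= \beta_1 \\
x_2 + \alpha_{23}x_3 + \cdots + \alpha_{2m}x_m + \alpha_{2,m+1}x_{m+1} + \cdots + \alpha_{2n}x_n &= \beta_2 \\
x_3 + \cdots + \alpha_{3m}x_m + \alpha_{3,m+1}x_{m+1} + \cdots + \alpha_{3n}x_n &= \beta_3 \\
&\;\;\vdots \\
x_m + \alpha_{m,m+1}x_{m+1} + \cdots + \alpha_{mn}x_n &= \beta_m
\end{aligned}
\]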

On the other hand, regardless of the relative size of m and n, the process may 
terminate because before we have made m reductions, say after only k ( < m) 
reductions, all coefficients in the left member of each of the remaining m — k 
equations are zero : 


\[
\begin{aligned}
x_1 + \alpha_{12}x_2 + \alpha_{13}x_3 + \cdots + \alpha_{1k}x_k + \cdots + \alpha_{1n}x_n &= \beta_1 \\
x_2 + \alpha_{23}x_3 + \cdots + \alpha_{2k}x_k + \cdots + \alpha_{2n}x_n &= \beta_2 \\
x_3 + \cdots + \alpha_{3k}x_k + \cdots + \alpha_{3n}x_n &= \beta_3 \\
&\;\;\vdots \\
x_k + \cdots + \alpha_{kn}x_n &= \beta_k \\
0 + \cdots + 0 &= \beta_{k+1} \\
&\;\;\vdots \\
0 + \cdots + 0 &= \beta_m
\end{aligned}
\]


In the first case, if we transpose all terms containing $x_{m+1}, x_{m+2}, \ldots, x_n$, 
we have a system of equations from which $x_m, x_{m-1}, x_{m-2}, \ldots, x_2, x_1$ can 
successively be found in terms of $x_{m+1}, x_{m+2}, \ldots, x_n$, which can be given arbitrary 
values. In the second case it may be that 

\[
\beta_{k+1} = \beta_{k+2} = \cdots = \beta_m = 0
\]

so that we have essentially the case we have just discussed, except that now it is 
$x_1, x_2, \ldots, x_k$ which are expressed in terms of the remaining unknowns $x_{k+1}$, 
$x_{k+2}, \ldots, x_n$, which can be given arbitrary values. If, however, one or more of 
the $\beta$'s after $\beta_k$ is different from zero, then we have a contradiction, and the 
original system has no solution, or, in other words, is inconsistent. 

Now consider the coefficient matrix and the augmented matrix of the reduced 
system. Since the augmented matrix contains the coefficient matrix, it is clear 
that its rank must be at least as great as the rank of the coefficient matrix. More- 
over, in the solvable cases it is evident from the definition of rank that the rank 
of the augmented matrix cannot exceed the rank of the coefficient matrix and, 
hence, must be equal to it. Likewise, it is obvious that in the inconsistent case 
the rank of the augmented matrix is actually greater than the rank of the coeffi- 
cient matrix. Finally, we observe that the ranks of the coefficient matrix and the 





augmented matrix for the reduced system are equal to the ranks of the respective 
matrices of the original system, since each step in the Gauss reduction, namely, 
rearranging the columns of unknowns, rearranging the equations, multiplying 
and dividing the equations by nonzero constants, and subtracting multiples of 
one equation from other equations, is an elementary transformation, which, by 
Theorem 2, Sec. 10.4, cannot change the rank of either matrix. Thus we have 
established the first assertion of the theorem. 

Let us now return to the reduced system in the solvable case. If r is the 
common value of the rank of the coefficient matrix and the augmented matrix, the 
reduced system can be written 

\[
\begin{aligned}
x_1 + \alpha_{12}x_2 + \alpha_{13}x_3 + \cdots + \alpha_{1r}x_r &= -\alpha_{1,r+1}x_{r+1} - \cdots - \alpha_{1n}x_n + \beta_1 \\
x_2 + \alpha_{23}x_3 + \cdots + \alpha_{2r}x_r &= -\alpha_{2,r+1}x_{r+1} - \cdots - \alpha_{2n}x_n + \beta_2 \\
&\;\;\vdots \\
x_r &= -\alpha_{r,r+1}x_{r+1} - \cdots - \alpha_{rn}x_n + \beta_r
\end{aligned}
\]

In this form it is clear that $x_{r+1}, x_{r+2}, \ldots, x_n$ can be given arbitrary values, say 

\[
x_{r+1} = \lambda_1, \quad x_{r+2} = \lambda_2, \quad \ldots, \quad x_n = \lambda_{n-r}
\]

Substituting these values into the last equation in the above system, we obtain 
$x_r$ immediately. Then, substituting for $x_r, x_{r+1}, \ldots, x_n$ in the next to the last 
equation, we obtain $x_{r-1}$, and so on, step by step until each of the x's is determined. 
Finally, when the expressions for $x_1, x_2, \ldots, x_r$ are simplified by collecting 
terms on $\lambda_1, \lambda_2, \ldots, \lambda_{n-r}$ we obtain expressions of the form 

\[
\begin{aligned}
x_1 &= \lambda_1c_{11} + \lambda_2c_{12} + \cdots + \lambda_{n-r}c_{1,n-r} + \gamma_1 \\
&\;\;\vdots \\
x_r &= \lambda_1c_{r1} + \lambda_2c_{r2} + \cdots + \lambda_{n-r}c_{r,n-r} + \gamma_r \\
x_{r+1} &= \lambda_1 \\
x_{r+2} &= \lambda_2 \\
&\;\;\vdots \\
x_n &= \lambda_{n-r}
\end{aligned}
\]

or 

\[
\begin{Vmatrix} x_1 \\ \vdots \\ x_r \\ x_{r+1} \\ x_{r+2} \\ \vdots \\ x_n \end{Vmatrix}
= \lambda_1 \begin{Vmatrix} c_{11} \\ \vdots \\ c_{r1} \\ 1 \\ 0 \\ \vdots \\ 0 \end{Vmatrix}
+ \lambda_2 \begin{Vmatrix} c_{12} \\ \vdots \\ c_{r2} \\ 0 \\ 1 \\ \vdots \\ 0 \end{Vmatrix}
+ \cdots
+ \lambda_{n-r} \begin{Vmatrix} c_{1,n-r} \\ \vdots \\ c_{r,n-r} \\ 0 \\ 0 \\ \vdots \\ 1 \end{Vmatrix}
+ \begin{Vmatrix} \gamma_1 \\ \vdots \\ \gamma_r \\ 0 \\ 0 \\ \vdots \\ 0 \end{Vmatrix}
\]

The n − r vectors which are multiplied respectively by $\lambda_1, \lambda_2, \ldots, \lambda_{n-r}$ depend 
only on the $a_{ij}$'s in the original system and are, in fact, solution vectors of the 
related homogeneous system $AX = 0$. The vector 

\[
\begin{Vmatrix} \gamma_1 \\ \vdots \\ \gamma_r \\ 0 \\ \vdots \\ 0 \end{Vmatrix}
\]

depends not only on the $a_{ij}$'s but also on the $b_i$'s and is clearly a particular solution 
of the nonhomogeneous system $AX = B$. By Theorem 8, the n − r vectors which 
are multiplied by the $\lambda$'s are linearly independent. Hence, the related homogeneous 
system $AX = 0$ has n − r linearly independent solutions, and the complete 
solution of the nonhomogeneous system $AX = B$ contains n − r independent 
arbitrary constants, as asserted. 


EXAMPLE 3 

Find a complete solution of the system 

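\[
\begin{aligned}
x_1 + 2x_2 + x_3 - x_4 + 2x_5 &= 2 \\
x_1 + 4x_2 + 5x_3 - 3x_4 + 8x_5 &= -2 \\
-2x_1 - x_2 + 4x_3 - x_4 + 5x_5 &= -10 \\
3x_1 + 7x_2 + 5x_3 - 4x_4 + 9x_5 &= 4
\end{aligned}
\]

Applying the Gauss reduction, we obtain successively 

\[
\begin{aligned}
x_1 + 2x_2 + x_3 - x_4 + 2x_5 &= 2 \\
2x_2 + 4x_3 - 2x_4 + 6x_5 &= -4 \\
3x_2 + 6x_3 - 3x_4 + 9x_5 &= -6 \\
x_2 + 2x_3 - x_4 + 3x_5 &= -2
\end{aligned}
\]

and 

\[
\begin{aligned}
x_1 + 2x_2 + x_3 - x_4 + 2x_5 &= 2 \\
x_2 + 2x_3 - x_4 + 3x_5 &= -2
\end{aligned}
\]

Hence, we have the solutions 

\[
\begin{aligned}
x_1 &= -2x_2 - x_3 + x_4 - 2x_5 + 2 \\
x_2 &= -2x_3 + x_4 - 3x_5 - 2
\end{aligned}
\]

or, taking $x_3 = \lambda_1$, $x_4 = \lambda_2$, $x_5 = \lambda_3$, 

\[
\begin{aligned}
x_1 &= 3\lambda_1 - \lambda_2 + 4\lambda_3 + 6 \\
x_2 &= -2\lambda_1 + \lambda_2 - 3\lambda_3 - 2 \\
x_3 &= \lambda_1 \\
x_4 &= \lambda_2 \\
x_5 &= \lambda_3
\end{aligned}
\]

A complete solution of the original system is, therefore, 

\[
X = \lambda_1 \begin{Vmatrix} 3 \\ -2 \\ 1 \\ 0 \\ 0 \end{Vmatrix}
+ \lambda_2 \begin{Vmatrix} -1 \\ 1 \\ 0 \\ 1 \\ 0 \end{Vmatrix}
+ \lambda_3 \begin{Vmatrix} 4 \\ -3 \\ 0 \\ 0 \\ 1 \end{Vmatrix}
+ \begin{Vmatrix} 6 \\ -2 \\ 0 \\ 0 \\ 0 \end{Vmatrix}
\]

where $\lambda_1, \lambda_2, \lambda_3$ are arbitrary scalars. 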

The existence of a solution for the nonhomogeneous system implies that the coefficient 
matrix and the augmented matrix 

\[
A = \begin{Vmatrix}
1 & 2 & 1 & -1 & 2 \\
1 & 4 & 5 & -3 & 8 \\
-2 & -1 & 4 & -1 & 5 \\
3 & 7 & 5 & -4 & 9
\end{Vmatrix}
\qquad\text{and}\qquad
\|AB\| = \begin{Vmatrix}
1 & 2 & 1 & -1 & 2 & 2 \\
1 & 4 & 5 & -3 & 8 & -2 \\
-2 & -1 & 4 & -1 & 5 & -10 \\
3 & 7 & 5 & -4 & 9 & 4
\end{Vmatrix}
\]

have the same rank. The fact that the complete solution contains three arbitrary constants 
implies that the common value of the rank of the two matrices is 2, since, according to the last 
theorem, 

(Number of arbitrary constants in complete solution) 
= (number of unknowns) − (common value of rank) 

It is, of course, not difficult to verify that A and $\|AB\|$ are both of rank 2. The vector 

\[
X_p = \begin{Vmatrix} 6 \\ -2 \\ 0 \\ 0 \\ 0 \end{Vmatrix}
\]

is a particular solution of the given nonhomogeneous equation $AX = B$. The vectors 

\[
X_1 = \begin{Vmatrix} 3 \\ -2 \\ 1 \\ 0 \\ 0 \end{Vmatrix}
\qquad
X_2 = \begin{Vmatrix} -1 \\ 1 \\ 0 \\ 1 \\ 0 \end{Vmatrix}
\qquad\text{and}\qquad
X_3 = \begin{Vmatrix} 4 \\ -3 \\ 0 \\ 0 \\ 1 \end{Vmatrix}
\]

are three linearly independent solutions of the related homogeneous equation $AX = 0$. That 
$\lambda_1X_1 + \lambda_2X_2 + \lambda_3X_3 + X_p$ is a complete solution of the given system follows, of course, from 
Theorem 6. 
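
The Gauss reduction used in the proof of Theorem 6 and in Example 3 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `gauss_reduce` and `complete_solution` are hypothetical, exact fractions are used to avoid round-off, and, instead of renaming the unknowns as the text does, the pivot simply moves on to the next usable column. Each pivot is also eliminated from the rows above it, so the back substitution described in the text is already carried out.

```python
from fractions import Fraction

def gauss_reduce(aug):
    """Reduce an augmented matrix ||A B|| by elementary row operations.

    Returns (rows, pivots): the reduced rows and the columns of the
    unknowns that carry a leading 1 (assumed helper, not from the text).
    """
    rows = [[Fraction(v) for v in row] for row in aug]
    m, n = len(rows), len(rows[0]) - 1      # n unknowns; last column is B
    pivots, r = [], 0
    for c in range(n):
        # look for an equation, at or below row r, with a nonzero coefficient
        p = next((i for i in range(r, m) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]             # rearrange equations
        lead = rows[r][c]
        rows[r] = [v / lead for v in rows[r]]           # divide by the pivot
        for i in range(m):
            if i != r and rows[i][c] != 0:              # subtract multiples
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    return rows, pivots

def complete_solution(aug):
    """Return (particular, basis), or None if the system is inconsistent."""
    rows, pivots = gauss_reduce(aug)
    n, r = len(aug[0]) - 1, len(pivots)
    # a remaining equation 0 = beta with beta != 0 means no solution exists
    if any(row[n] != 0 for row in rows[r:]):
        return None
    free = [c for c in range(n) if c not in pivots]
    particular = [Fraction(0)] * n
    for i, c in enumerate(pivots):
        particular[c] = rows[i][n]
    basis = []
    for f in free:                      # one solution of AX = 0 per free unknown
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for i, c in enumerate(pivots):
            v[c] = -rows[i][f]
        basis.append(v)
    return particular, basis

# The augmented matrix of Example 3.
AUG = [
    [1, 2, 1, -1, 2, 2],
    [1, 4, 5, -3, 8, -2],
    [-2, -1, 4, -1, 5, -10],
    [3, 7, 5, -4, 9, 4],
]
particular, basis = complete_solution(AUG)
print([str(v) for v in particular])          # particular solution X_p
print([[str(v) for v in b] for b in basis])  # n - r = 3 homogeneous solutions
```

For the system of Example 3 the sketch gives the particular solution (6, −2, 0, 0, 0) and the three homogeneous solutions (3, −2, 1, 0, 0), (−1, 1, 0, 1, 0), (4, −3, 0, 0, 1), so the rank is 2 and the number of arbitrary constants is 5 − 2 = 3, in agreement with Theorem 6.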


As the last example illustrated, the Gauss reduction provides 
a practical method for solving systems of simultaneous linear 
equations in the general case. However, in several important 
special cases there are other methods which are sometimes more 
convenient. Specifically, we have the following pair of theorems : 


THEOREM 7 (CRAMER’S RULE) 

If the coefficient matrix A of a system $AX = B$ of n linear equations in n 
unknowns is nonsingular, the system has the unique solution 

\[
x_1 = \frac{|D_1|}{|A|}, \qquad x_2 = \frac{|D_2|}{|A|}, \qquad \ldots, \qquad x_n = \frac{|D_n|}{|A|}
\]

where $D_i$ is the matrix obtained from A by replacing the ith column of A by the 
column vector B. 


PROOF Since the matrix of coefficients A is nonsingular by hypothesis, it 
follows that its inverse $A^{-1}$ exists. Hence, premultiplying the given equation 
$AX = B$ by $A^{-1}$, we obtain 

\[
X = A^{-1}B
\]

and direct substitution confirms that this actually is a solution. Now, in the 
column vector $A^{-1}B$, the element in the ith row is simply the scalar product of 
the ith row vector of $A^{-1}$ and B itself, that is, 

\[
\frac{A_{1i}b_1 + A_{2i}b_2 + \cdots + A_{ni}b_n}{|A|}
\]

Moreover, the numerator of this fraction is just the expansion, in terms of the 
ith column, of the determinant of the matrix $D_i$ obtained from A by replacing 
the ith column of A by the column vector B. Hence 

\[
x_i = \frac{|D_i|}{|A|} \tag{4}
\]

as asserted. 

Since A is nonsingular, the rank of the coefficient matrix and the rank of the 
augmented matrix are both equal to n. Hence, according to Theorem 6, there 
can be no arbitrary constant in any complete solution, and the solution we have 
found is the only one. 
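
As a small illustration of Theorem 7, the following Python sketch evaluates $x_i = |D_i|/|A|$ directly. The helper names `det` and `cramer` and the two-equation test system are assumptions introduced here for illustration; the determinant is computed by elimination with exact fractions rather than by cofactor expansion.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by elimination, using exact fractions."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        p = next((i for i in range(c, n) if a[i][c] != 0), None)
        if p is None:
            return Fraction(0)          # a zero column below the diagonal
        if p != c:
            a[c], a[p] = a[p], a[c]     # a row interchange changes the sign
            sign = -sign
        result *= a[c][c]
        for i in range(c + 1, n):
            f = a[i][c] / a[c][c]
            a[i] = [x - f * y for x, y in zip(a[i], a[c])]
    return sign * result

def cramer(a, b):
    """Solve AX = B by Cramer's rule; A must be square and nonsingular."""
    d = det(a)
    if d == 0:
        raise ValueError("coefficient matrix is singular")
    solution = []
    for i in range(len(a)):
        # D_i: replace the ith column of A by the column vector B
        d_i = [list(row[:i]) + [b[k]] + list(row[i + 1:])
               for k, row in enumerate(a)]
        solution.append(det(d_i) / d)
    return solution

# Illustrative system: 2x + y = 5, x - y = 1.
print([str(v) for v in cramer([[2, 1], [1, -1]], [5, 1])])  # x = 2, y = 1
```

Because it requires n + 1 determinants, this approach is convenient mainly for small systems; for larger ones the Gauss reduction is usually less work.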


If in the (n,n) system $AX = B$ the vector B is zero, that is, 
if $b_1 = b_2 = \cdots = b_n = 0$, then, clearly, each determinant 
$|D_i|$ contains a column consisting entirely of zeros and hence is 
zero. If $|A| \neq 0$, it therefore follows from (4) that $x_i = 0$ for all 
values of i, or, in other words, that only a trivial solution is 
possible. On the other hand, if $B = 0$, the coefficient matrix and 
the augmented matrix have the same rank, and if $|A| = 0$ the 
common value of these ranks is at most n − 1; that is, 
$r \le n - 1$. Hence n − r is at least as much as 1, and, by Theorem 
6, the equation $AX = 0$ has at least one nontrivial solution 
vector. Thus we have established the following important 
corollary of Theorem 7: 


COROLLARY 1 

A homogeneous system of n linear equations in n unknowns $AX = 0$ has a 
nontrivial solution, i.e., a solution other than $x_1 = x_2 = \cdots = x_n = 0$, if and only 
if the determinant of the coefficients $|A|$ is equal to zero. 

More specifically, when the rank of the coefficient matrix 
of a homogeneous system of n linear equations in n unknowns is 
n − 1, we have the following useful result: 


THEOREM 8 

If the coefficient matrix of a homogeneous system of n linear equations in n 
unknowns $AX = 0$ is of rank n − 1 and if the submatrix obtained from A by 
omitting the kth row is also of rank n − 1, then a complete solution of the given 
system is 

\[
x_i = cA_{ki} \qquad i = 1, 2, \ldots, n
\]
where c is an arbitrary constant. 

PROOF Since the rank of the (n − 1, n) matrix remaining when the kth row 
is deleted from A is n − 1, it follows that at least one of the cofactors $A_{ki}$ of the 
kth row is different from zero. Hence, not all of the values $x_i = cA_{ki}$ are zero. 
To verify that these values do indeed satisfy $AX = 0$, we need only substitute 
them into the general equation of the system, namely, 

\[
\sum_{i=1}^{n} a_{ji}x_i = 0 \qquad j = 1, 2, \ldots, n
\]

and verify that it is satisfied. Doing this, and using Corollary 1, Theorem 11, 
Sec. 10.1, to simplify the result, we have 

\[
\sum_{i=1}^{n} a_{ji}(cA_{ki}) = c\sum_{i=1}^{n} a_{ji}A_{ki} =
\begin{cases} 0 & j \neq k \\ c|A| & j = k \end{cases}
\]

Thus, since $|A| = 0$, by hypothesis, it follows that each equation is satisfied by 
the given values. Finally, since the rank of both the coefficient matrix and the 
augmented matrix is r = n − 1, it follows by Theorem 6 that the system has 
just one independent solution vector. Hence the solution given by the formula 
of the theorem is a complete solution, as asserted. 


EXAMPLE 4 


Find a complete solution of the system 

\[
\begin{aligned}
x_1 - 2x_2 + x_3 + 3x_4 &= 0 \\
2x_1 + 2x_2 - x_3 + x_4 &= 0 \\
-x_1 - x_2 + 3x_3 + 2x_4 &= 0 \\
x_1 - 8x_2 - x_3 + 3x_4 &= 0
\end{aligned}
\]

It is easy to verify that the determinant of the coefficients of this system, $|A|$, is zero but that 
the determinant of the (3,3) submatrix in the upper left-hand corner of A is different from zero. 
Thus the coefficient matrix A and the submatrix remaining when the last row is deleted from A 
are both of rank 3. Hence, according to the last theorem, the values of x which satisfy the given 
system of equations are proportional to the cofactors of the last row in $|A|$. Thus we have 

\[
x_1 = 20c \qquad x_2 = -5c \qquad x_3 = 15c \qquad x_4 = -15c
\]

or, setting 5c = k, 

\[
x_1 = 4k \qquad x_2 = -k \qquad x_3 = 3k \qquad x_4 = -3k
\]
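
The cofactor computation of Example 4 is easy to check numerically. The sketch below assumes the coefficient matrix of Example 4 reads as entered in `A`; the helper names `minor`, `det`, and `cofactor` are introduced here only for illustration, and the determinant is expanded along the first row, which is adequate for a matrix this small.

```python
def minor(m, i, j):
    """Matrix left after deleting row i and column j (zero-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def cofactor(m, i, j):
    """Signed minor of the element in row i, column j."""
    return (-1) ** (i + j) * det(minor(m, i, j))

# Coefficient matrix of Example 4 (assumed reading of the system).
A = [
    [1, -2, 1, 3],
    [2, 2, -1, 1],
    [-1, -1, 3, 2],
    [1, -8, -1, 3],
]
k = 3                                        # last row, zero-based
x = [cofactor(A, k, i) for i in range(4)]    # cofactors of the last row
residuals = [sum(a * b for a, b in zip(row, x)) for row in A]
print(det(A), x, residuals)
```

With this matrix the determinant is zero and the cofactors of the last row are (−20, 5, −15, 15), which is proportional to (4, −1, 3, −3); the overall sign is immaterial because the constant multiplying the cofactors is arbitrary.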


In view of Theorems 7 and 8, it is clearly desirable to have 
some convenient criterion for determining when a determinant 
is different from zero. One useful one is provided by the following 
theorem, whose proof is an interesting application of Corollary 
1, Theorem 7: 


THEOREM 9 

If in each row of a determinant the absolute value of the element on the principal 
diagonal is greater than the sum of the absolute values of the remaining elements 
in that row, the value of the determinant is different from zero. 

PROOF Let $|A|$ be an n × n determinant in which, for each value of i,† 

\[
|a_{ii}| > \sum_{\substack{j=1 \\ j \neq i}}^{n} |a_{ij}| \tag{5}
\]


† This property of the absolute values of the elements of a determinant is 
known as diagonal dominance and is quite important in the solution of 
systems of linear equations by iterative methods. 




To prove that $|A|$ is not equal to zero, let us assume the contrary. Now, if $|A| = 0$, 
then by Corollary 1, Theorem 7, the equations 

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= 0
\end{aligned}
\tag{6}
\]

have a nontrivial solution. Let $x_k$ be the component of maximum absolute value 
of such a solution vector. Then from the kth equation of the system (6) we have, 
by transposing, 

\[
a_{kk}x_k = -\sum_{\substack{j=1 \\ j \neq k}}^{n} a_{kj}x_j
\]

Hence, taking absolute values, 

\[
|a_{kk}| \cdot |x_k| \le \sum_{\substack{j=1 \\ j \neq k}}^{n} |a_{kj}| \cdot |x_j|
\le \sum_{\substack{j=1 \\ j \neq k}}^{n} |a_{kj}| \cdot |x_k|
\]

and therefore, dividing out $|x_k|$, 

\[
|a_{kk}| \le \sum_{\substack{j=1 \\ j \neq k}}^{n} |a_{kj}|
\]

But this contradicts some one of the inequalities (5), which hold, by hypothesis. 
Therefore we must abandon the assumption that $|A| = 0$, and the theorem is 
established. 
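
The criterion of Theorem 9 is simple to test mechanically. The following Python sketch, with hypothetical helper names and two illustrative matrices that are not taken from the text, checks the inequalities (5) row by row and compares the outcome with the determinant.

```python
def det(m):
    """Determinant by expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j in range(len(m)):
        sub = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(sub)
    return total

def is_diagonally_dominant(a):
    """True if |a_ii| exceeds the sum of |a_ij|, j != i, in every row."""
    return all(
        abs(row[i]) > sum(abs(v) for j, v in enumerate(row) if j != i)
        for i, row in enumerate(a)
    )

M = [[4, 1, -2], [1, -5, 3], [2, 2, 7]]   # dominant in every row
N = [[1, 2], [3, 4]]                      # not dominant

print(is_diagonally_dominant(M), det(M))  # True, and the determinant is -189
print(is_diagonally_dominant(N), det(N))  # False, yet the determinant is -2
```

The second matrix is a reminder that the condition of the theorem is sufficient but not necessary: a determinant may well be different from zero without being diagonally dominant.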

If a set of vectors has the property that it contains at least 
one subset of r vectors which are linearly independent, but every 
subset of r + 1 vectors is linearly dependent, the set is said to be of 
dimension r. Using Theorem 6, it is not difficult to prove the 
following theorem about the dimension of a set of vectors: 


THEOREM 10 

A set of row (column) vectors $X_1, X_2, \ldots, X_n$ is of dimension r if and only if 
the matrix 

\[
\begin{Vmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{Vmatrix}
\qquad
\bigl(\|X_1 \; X_2 \; \cdots \; X_n\|\bigr)
\]

is of rank r. 


COROLLARY 1 

If $X_1, X_2, \ldots, X_n$ are n linearly independent vectors each having n 
components, then any vector B with n components can be expressed as a linear 
combination of the X's. 

In Sec. 10.4 we observed that what we called simply the 
rank of a matrix is sometimes referred to as the determinant 
rank. This permits one to distinguish between the rank, as we 
defined it, and the row rank and column rank, which are, by 
definition, the maximum number of linearly independent row 




and column vectors, respectively, in the matrix. However, the 
necessity for distinguishing between the three definitions of 
rank is eliminated by the following theorem, whose proof follows 
immediately from Theorem 10: 

THEOREM 11 

For any matrix A, the determinant rank, the row rank, and the column rank 
are all equal. 

Another interesting consequence of Theorem 10 is contained 
in the following theorem: 

THEOREM 12 

If A and B are conformable matrices of rank r and $\rho$, respectively, the rank of 
the product AB is equal to or less than the smaller of the numbers r and $\rho$. 

PROOF Let A be an (m,p) matrix, let B be a (p,n) matrix, and let the rows 
of A be $A_1, A_2, \ldots, A_m$. Then the ith row vector in the product AB (see 
Exercise 7, Sec. 10.2) is 

\[
A_iB \tag{7}
\]

Now, by hypothesis, the rank of A is r. Hence, by Theorem 10, A contains exactly 
r linearly independent row vectors, which, without loss of generality, we can 
take to be the first r, namely, $A_1, A_2, \ldots, A_r$. Hence, for i = r + 1, r + 2, 
. . . , m, the row $A_i$ must be expressible as a linear combination of the first r 
rows: 

\[
A_i = \lambda_{1i}A_1 + \lambda_{2i}A_2 + \cdots + \lambda_{ri}A_r \qquad i = r + 1, r + 2, \ldots, m
\]

Therefore, substituting into (7), we find that, for i = r + 1, r + 2, . . . , m, 
the ith row vector of the product AB is 

\[
(\lambda_{1i}A_1 + \lambda_{2i}A_2 + \cdots + \lambda_{ri}A_r)B
= \lambda_{1i}A_1B + \lambda_{2i}A_2B + \cdots + \lambda_{ri}A_rB
\]

But this shows that each row of AB after the rth is a linear combination of the 
first r rows, which, in turn, proves that AB contains at most r linearly independent 
row vectors and, hence, is of rank at most r. A similar argument, using a column 
partition of B, shows that the rank of AB is at most equal to $\rho$. Therefore, the 
rank of AB is at most equal to the smaller of the numbers r and $\rho$, as asserted. 
If A and B are also conformable in the order BA, it is clear that the rank of BA 
is also equal to or less than the smaller of the pair $(r, \rho)$. 

The estimate for the rank of the product AB provided by 
the last theorem can be supplemented with the following result:* 

THEOREM 13 

If A is an (m,p) matrix of rank r and if B is a (p,n) matrix of rank $\rho$, the rank 
of the product AB is equal to or greater than $r + \rho - p$. 


* Both this result and Theorem 12 are due to the English mathematician 
J. J. Sylvester (1814-1897) and are known together as Sylvester’s law of 
nullity. A proof of Theorem 13 can be found in L. Mirsky, "Linear Algebra,” 
p. 162, Oxford Book Company, Inc., New York, 1955. 
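
The two estimates for the rank of a product can be checked on a small case. In the Python sketch below, the helper names `rank` and `matmul` and the two matrices are assumptions chosen for illustration; the rank is found by elimination with exact fractions, which by Theorem 11 agrees with the determinant rank.

```python
from fractions import Fraction

def rank(matrix):
    """Rank found by row elimination with exact fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def matmul(a, b):
    """Product of conformable matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

A = [[1, 0, 0], [0, 1, 0]]          # a (2,3) matrix of rank 2
B = [[0, 0], [0, 1], [1, 0]]        # a (3,2) matrix of rank 2
AB = matmul(A, B)
p = len(B)                          # the common dimension p = 3

lower = rank(A) + rank(B) - p       # bound of Theorem 13
upper = min(rank(A), rank(B))       # bound of Theorem 12
print(lower, rank(AB), upper)       # 1 1 2
```

In this instance the product has rank 1, so the lower bound of Theorem 13 is attained while the upper bound of Theorem 12 is not.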





As we shall see in later sections, a set of vectors is usually 
much more convenient to work with if the vectors, in addition to 
being linearly independent, are also orthonormal, that is, are of 
unit length and mutually orthogonal. A general set of r linearly 
independent vectors will ordinarily not possess the property of 
orthonormality, but, by an important procedure known as the 
Schmidt orthogonalization process, it is always possible to 
determine linear combinations of r linearly independent vectors 
which will be orthonormal as well as independent. Let $V_1, V_2, \ldots, V_n$ 
be n linearly independent vectors, and let us choose 
any one of them, say $V_1$, and reduce it to a unit vector by dividing 
it by its length $\sqrt{V_1^TV_1}$. This gives us the first vector of our 
orthonormal set: 

\[
U_1 = \frac{V_1}{\sqrt{V_1^TV_1}}
\]

We now choose any member of the original set except $V_1$, say 
$V_2$, and write 

\[
W_2 = V_2 - c_1U_1
\]

where $c_1$ is a constant to be determined so that $W_2$ is orthogonal 
to $U_1$. This, of course, requires that 

\[
U_1^TW_2 = U_1^T(V_2 - c_1U_1) = 0
\]

From this, since $U_1^TU_1 = 1$, we have 

\[
c_1 = U_1^TV_2 \qquad\text{and}\qquad W_2 = V_2 - (U_1^TV_2)U_1
\]

We now convert $W_2$ to a unit vector by dividing it by its length, 
getting 

\[
U_2 = \frac{W_2}{\sqrt{W_2^TW_2}}
= \frac{V_2 - (U_1^TV_2)U_1}{\sqrt{[V_2 - (U_1^TV_2)U_1]^T[V_2 - (U_1^TV_2)U_1]}}
\]

Next we choose any member of the original set except $V_1$ and 
$V_2$, say $V_3$, and write 

\[
W_3 = V_3 - d_1U_1 - d_2U_2
\]

where $d_1$ and $d_2$ are constants to be determined so that $W_3$ will 
be orthogonal to both $U_1$ and $U_2$. This gives us the two conditions 

\[
\begin{aligned}
U_1^TW_3 &= U_1^T(V_3 - d_1U_1 - d_2U_2) = U_1^TV_3 - d_1 = 0 \\
U_2^TW_3 &= U_2^T(V_3 - d_1U_1 - d_2U_2) = U_2^TV_3 - d_2 = 0
\end{aligned}
\]

Hence, 

\[
d_1 = U_1^TV_3 \qquad\text{and}\qquad d_2 = U_2^TV_3
\]

and, therefore, 

\[
W_3 = V_3 - (U_1^TV_3)U_1 - (U_2^TV_3)U_2
\]





$W_3$ is now normalized, giving us our third unit vector $U_3$, and 
the process is continued until the required set of orthonormal 
vectors is obtained from the original set. It is clear that the 
process can fail if and only if at some stage 

\[
W_k = V_k - \sum_{i=1}^{k-1} (U_i^TV_k)U_i = 0
\]

However, if $W_k = 0$, this implies that 

\[
V_k = \sum_{i=1}^{k-1} (U_i^TV_k)U_i
\]

which, replacing the U's by their expressions in terms of $V_1, V_2, 
\ldots, V_{k-1}$, implies that $V_k$ is either zero or else a linear 
combination of the preceding V's. Each of these contradicts the 
hypothesis that the V's are linearly independent, and hence 
cannot happen. An almost identical argument (or the result of 
Exercise 14) shows that the U's derived by the Schmidt process 
are linearly independent. 
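
The Schmidt process translates almost line for line into code. The Python sketch below follows the formulas above, with $W_k = V_k - \sum (U_i^TV_k)U_i$ followed by normalization; the function names, the tolerance `tol` used to detect $W_k = 0$ in floating-point arithmetic, and the sample vectors are assumptions made for illustration.

```python
import math

def dot(u, v):
    """Scalar product U^T V of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def schmidt(vectors, tol=1e-12):
    """Return orthonormal vectors U_1, ..., U_n built from V_1, ..., V_n."""
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            c = dot(u, v)                       # coefficient U_i^T V_k
            w = [wi - c * ui for wi, ui in zip(w, u)]
        length = math.sqrt(dot(w, w))
        if length < tol:                        # W_k = 0: the V's are dependent
            raise ValueError("vectors are linearly dependent")
        basis.append([wi / length for wi in w])
    return basis

# Three linearly independent vectors chosen for illustration.
U = schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
for i in range(3):
    # each row of this table should be close to a row of the unit matrix
    print([round(dot(U[i], U[j]), 12) for j in range(3)])
```

In exact arithmetic the process fails only when the given vectors are dependent; in floating-point arithmetic a nearly dependent set can lose orthogonality through round-off, which is why the sketch tests the length of $W_k$ against a small tolerance rather than against zero.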

EXERCISES 

1 Verify that sin x and cos x are linearly independent. 

2 Are $\cos^2 x$, $\sin^2 x$, and $\cos 2x$ linearly independent? Why? 

3 Show that, if 0 is included in a set of quantities, the members of the set are always linearly 
dependent. 

4 Show that, if the quantities $Q_1, Q_2, \ldots, Q_n$ are linearly independent, the members of every 
subset of the Q's are also linearly independent. Is the converse true? 

5 If A is a square matrix and if the equation $AX = 0$ has k linearly independent solution 
vectors, show that the same is true of the system $A^TX = 0$. Is this result true if A is not a 
square matrix? 

6 Show that five or more 2 × 2 matrices are always linearly dependent. How many 3 × 3 
matrices must we have before we can be sure they are linearly dependent? Why? 

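7 Verify that the vectors 

\[
X_1 = \begin{Vmatrix} 1 \\ 1 \\ 0 \end{Vmatrix} \qquad
X_2 = \begin{Vmatrix} 1 \\ -1 \\ 1 \end{Vmatrix} \qquad
X_3 = \begin{Vmatrix} 2 \\ 1 \\ 3 \end{Vmatrix} \qquad
X_4 = \begin{Vmatrix} -1 \\ 4 \\ -5 \end{Vmatrix}
\]

are linearly dependent, and express each of the vectors as a linear combination of the other 
three. 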

II 1 2 II II 2 3 II 

8 What conditions must a, b, c, and d satisfy in order that the matrices q M — 2 1| 

and || * ^ ! be linearly dependent? 

9 Using the Gauss reduction, find a complete solution of each of the following systems: 

a 
\[
\begin{aligned}
x_1 + 2x_2 + 4x_3 - x_4 + 2x_5 &= 3 \\
3x_1 + 4x_2 + 5x_3 - x_4 - 2x_5 &= 7 \\
x_1 + 3x_2 + 4x_3 + 3x_4 - x_5 &= 4
\end{aligned}
\]

b 
\[
\begin{aligned}
x_1 + x_2 + x_3 - x_4 - x_5 &= 2 \\
x_1 + 2x_2 + 4x_3 - x_4 + 5x_5 &= 3 \\
3x_1 + 4x_2 + 5x_3 - x_4 - 2x_5 &= 7 \\
x_1 + 3x_2 + 4x_3 + 5x_4 - x_5 &= 4 \\
2x_1 + 5x_2 + 8x_3 + 4x_4 + x_5 &= 7 \\
x_1 - x_2 - 2x_3 - 7x_4 - x_5 &= 0
\end{aligned}
\]




10 Using Cramer's rule, solve each of the following systems: 

a 
\[
\begin{aligned}
x_1 - x_2 + 2x_3 &= -5 \\
-x_1 + 3x_3 &= 0 \\
2x_1 + x_2 &= 1
\end{aligned}
\]

b 
\[
\begin{aligned}
x_1 - x_2 + 2x_3 + x_4 &= -5 \\
-x_1 + 3x_3 + 2x_4 &= 0 \\
2x_1 + x_2 - x_4 &= 1 \\
2x_1 + 2x_2 + x_3 + 3x_4 &= -1
\end{aligned}
\]

11 Using Theorem 8, solve each of the following systems: 

a 
\[
\begin{aligned}
x_1 - 2x_2 + 3x_3 &= 0 \\
2x_1 + 3x_2 - x_3 &= 0 \\
4x_1 - x_2 + 5x_3 &= 0
\end{aligned}
\]

b 
\[
\begin{aligned}
x_1 - 2x_2 + x_3 - 3x_4 &= 0 \\
2x_1 + x_2 - 3x_3 + x_4 &= 0 \\
3x_1 + 3x_2 - 2x_3 + x_4 &= 0
\end{aligned}
\]

12 Determine the values of $\lambda$, if any, for which each of the following systems has a nontrivial 
solution, and find such solutions when they exist: 

a 
\[
\begin{aligned}
\lambda x_1 - 2x_2 + x_3 &= 0 \\
\lambda x_1 + (1 - \lambda)x_2 + x_3 &= 0 \\
2x_1 - x_2 + 2\lambda x_3 &= 0
\end{aligned}
\]

b 
\[
\begin{aligned}
(5 - \lambda)x_1 + 4x_2 - 2x_3 &= 0 \\
4x_1 + (5 - \lambda)x_2 - 2x_3 &= 0 \\
-2x_1 - 2x_2 + (3 - 2\lambda)x_3 &= 0
\end{aligned}
\]

13 If the rows of a matrix are linearly dependent, are the columns necessarily linearly dependent? 

14 Show that, if the vectors of a set are mutually orthogonal, they are linearly independent. 

15 Show that, if r is the maximum number of linearly independent quantities in the set 
$Q_1, Q_2, \ldots, Q_n$ and if $Q_1, Q_2, \ldots, Q_r$ are linearly independent, then $Q_{r+1}, Q_{r+2}, \ldots, Q_n$ 
can each be expressed as a linear combination of $Q_1, Q_2, \ldots, Q_r$. 

16 If the quantities $Q_1, Q_2, \ldots, Q_n$ are such that $Q_{r+1}, Q_{r+2}, \ldots, Q_n$ can each be expressed 
as a linear combination of $Q_1, Q_2, \ldots, Q_r$, show that at most r of the Q's can be linearly 
independent. 

17 Show that any (3,4) matrix of rank 2 can be written as the product of a (3,2) matrix of rank 
2 and a (2,4) matrix of rank 2. Of what general theorem do you think this is a special case? 

18 Prove Theorem 10. 

19 Prove Corollary 1, Theorem 10. 

20 Using the Schmidt process, construct a set of orthonormal vectors from the vectors in each 
of the following sets: 

a $V_1 = \|1, 2, 2\|$, $V_2 = \|1, 4, 0\|$, $V_3 = \|2, 0, 1\|$ 

b $V_1 = \|1, 1, 0\|$, $V_2 = \|1, 0, 1\|$, $V_3 = \|0, 1, 1\|$ 

c $V_1 = \|1, 1, 1, 1\|$, $V_2 = \|0, 1, 2, 2\|$, $V_3 = \|0, 0, 1, 1\|$ 

21 Prove that, if an (n, n + 1) matrix A contains a column of elements which are not all zero 
and if every nth-order determinant in A which contains this column vanishes, then the rank 
of A is less than n. (Hint: Expand each of the vanishing determinants in terms of the 
elements in their common column, consider the determinant of the resulting system of 
equations, and use the result of Exercise 8, Sec. 10.3.) 

22 Prove that n vectors $V_1, V_2, \ldots, V_n$ are linearly dependent if and only if the so-called 
Gram determinant, or Gramian, 

\[
\begin{vmatrix}
V_1^TV_1 & V_1^TV_2 & \cdots & V_1^TV_n \\
V_2^TV_1 & V_2^TV_2 & \cdots & V_2^TV_n \\
\vdots & \vdots & & \vdots \\
V_n^TV_1 & V_n^TV_2 & \cdots & V_n^TV_n
\end{vmatrix}
\]

is equal to zero. 

23 If A is a square matrix, p a positive integer, and X a vector such that $A^pX \neq 0$ but 
$A^{p+1}X = 0$, show that the vectors $X, AX, A^2X, \ldots, A^pX$ are linearly independent. 

24 a Let A and B be matrices conformable in the order AB. Prove that the rank of AB is 
equal to the rank of B if and only if $BX = 0$ for every vector X such that $ABX = 0$. 

b If A is a square matrix and if p is a positive integer such that $A^p$ and $A^{p+1}$ have the same 
rank r, show that $A^{p+2}, A^{p+3}, A^{p+4}, \ldots$ also are of rank r. 

25 Prove that, if A is an (n,n) matrix, then $A^n, A^{n+1}, A^{n+2}, \ldots$ all have the same rank. 


10.6 

Matric differential equations 

In Sec. 3.3 we saw that the ideas of complementary function and 
particular integral which we developed for single linear differen- 
tial equations could easily be extended to systems of linear 
differential equations. This analogy is especially striking when 
we regard a system of linear differential equations with constant 
coefficients as a single matric equation, much as we regarded a 
system of linear algebraic equations as a single matric equation 
in the last section. Moreover, the procedure for handling systems 
of equations when the characteristic equation has repeated or 
complex roots or when a term on the right-hand side duplicates 
a term in the complementary function is best described in the 
language of matrices. Hence we shall conclude this chapter with a 
brief discussion of matric differential equations. 

Let the system we are given be 

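\[
\begin{aligned}
p_{11}(D)x_1 + p_{12}(D)x_2 + \cdots + p_{1n}(D)x_n &= f_1(t) \\
p_{21}(D)x_1 + p_{22}(D)x_2 + \cdots + p_{2n}(D)x_n &= f_2(t) \\
&\;\;\vdots \\
p_{n1}(D)x_1 + p_{n2}(D)x_2 + \cdots + p_{nn}(D)x_n &= f_n(t)
\end{aligned}
\tag{1}
\]

where the $p_{ij}$'s are polynomials in the operator D with constant 
coefficients. If we define the matrices 

\[
P(D) = \begin{Vmatrix}
p_{11}(D) & p_{12}(D) & \cdots & p_{1n}(D) \\
p_{21}(D) & p_{22}(D) & \cdots & p_{2n}(D) \\
\vdots & \vdots & & \vdots \\
p_{n1}(D) & p_{n2}(D) & \cdots & p_{nn}(D)
\end{Vmatrix}
\qquad
X = \begin{Vmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{Vmatrix}
\qquad
F(t) = \begin{Vmatrix} f_1(t) \\ f_2(t) \\ \vdots \\ f_n(t) \end{Vmatrix}
\]

the system (1) can be written in the compact form 

\[
P(D)X = F(t) \tag{2}
\]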

The associated homogeneous equation is, of course, 

(3) P(D)X = 0 

The first step in finding the complementary function of 
Eq. (2) is to assume that solutions of Eq. (3) exist in the form 

\[
X = Ae^{mt}
\]

where the scalar m and the column matrix of constants A have 
yet to be determined. [The expressions 

\[
x = ae^{mt} \qquad y = be^{mt} \qquad z = ce^{mt}
\]

in Eq. (3), Sec. 3.3, are, of course, just the scalar form of this 
assumption for the special case n = 3.] Since 

\[
D^r(e^{mt}) = m^re^{mt}
\]

it follows that, if we substitute the vector $X = Ae^{mt}$ into the 
homogeneous equation (3), we obtain just 

\[
P(m)Ae^{mt} = 0
\]

or, dividing out the nonvanishing scalar factor $e^{mt}$, 

\[
P(m)A = 0 \tag{4}
\]

This is the matric equivalent of the algebraic system in Eq. (4), 
Sec. 3.3, which we obtained in our scalar treatment of the specific 
system of differential equations we considered in that section. 
Now by Corollary 1, Theorem 7, Sec. 10.5, Eq. (4) will have a 
nontrivial solution if and only if 

$$|P(m)| = 0 \tag{5}$$

and for each root $m_j$ of this equation there will be a solution vector
$A_j$ of (4) determined to within an arbitrary scalar factor $k_j$. If the
characteristic equation (5) is of degree $N$ and if its roots $\{m_j\}$ are
all distinct, a complete solution of Eq. (3), and hence the
complementary function of Eq. (2), is then

$$X = k_1A_1e^{m_1t} + k_2A_2e^{m_2t} + \cdots + k_NA_Ne^{m_Nt}$$

This we recognize as the matric equivalent of the scalar system 
(9), Sec. 3.3, with N = 3 and 


II 

-1 

2 

As = 

3 

-1 

As = 

9 

-6 


-1 1 


-1 


-1 


As in the case of a single scalar differential equation, if the
set of roots $\{m_j\}$ includes one or more pairs of conjugate complex
roots, it is desirable to reduce the corresponding complex
exponential solution to a purely real form. To see how this can be
accomplished, let $p \pm iq$ be a pair of conjugate complex roots
of Eq. (5), and let $A$ be a particular solution vector of (4)
corresponding to the root $m = p + iq$; that is, let

$$P(m)A \equiv P(p + iq)A = 0$$

Then, since all the coefficients in (4) are real, it follows by taking
conjugates throughout the system that

$$P(\bar{m})\bar{A} \equiv P(p - iq)\bar{A} = 0$$

Thus $\bar{A}$ is a solution vector corresponding to the conjugate root

$$\bar{m} = p - iq$$

and, therefore, we have the two particular solutions of Eq. (3),

$$Ae^{(p+iq)t} \qquad \text{and} \qquad \bar{A}e^{(p-iq)t}$$




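By combining these as follows and applying the Euler formulas,
we obtain two independent real solutions:

$$
\begin{aligned}
\frac{Ae^{(p+iq)t} + \bar{A}e^{(p-iq)t}}{2}
&= e^{pt}\left(\frac{A + \bar{A}}{2}\cos qt - \frac{A - \bar{A}}{2i}\sin qt\right) \\
&= e^{pt}\,[\mathcal{R}(A)\cos qt - \mathcal{I}(A)\sin qt]
\end{aligned}
\tag{6a}
$$

$$
\begin{aligned}
\frac{Ae^{(p+iq)t} - \bar{A}e^{(p-iq)t}}{2i}
&= e^{pt}\left(\frac{A - \bar{A}}{2i}\cos qt + \frac{A + \bar{A}}{2}\sin qt\right) \\
&= e^{pt}\,[\mathcal{I}(A)\cos qt + \mathcal{R}(A)\sin qt]
\end{aligned}
\tag{6b}
$$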

where $\mathcal{R}(A)$ and $\mathcal{I}(A)$ denote the column matrices whose
components are, respectively, the real parts of the components of $A$
and the imaginary parts of the components of $A$. In many cases
this method of determining the necessary relations among the
coefficients of solutions of (3) of the form

$$x_j = e^{pt}(a_j\cos qt + b_j\sin qt)$$

is simpler than the alternative process of substituting these
expressions into the original differential equations, collecting
terms, and equating the resulting coefficients to zero.
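The passage from a complex solution vector to the real pair (6a), (6b) can be checked numerically. The following is a minimal sketch, not part of the original development: the function names and the sample values of `A`, `p`, `q`, and `t` are illustrative assumptions. It relies only on the fact that (6a) and (6b) are the real and imaginary parts of $Ae^{(p+iq)t}$.

```python
import cmath
import math

def real_solutions(A, p, q, t):
    """Evaluate the real solutions (6a) and (6b) at time t.

    A is a list of complex components of a solution vector of P(m)A = 0
    for the root m = p + iq (sample inputs, assumed for illustration).
    """
    growth = math.exp(p * t)
    c, s = math.cos(q * t), math.sin(q * t)
    x6a = [growth * (a.real * c - a.imag * s) for a in A]  # R(A)cos - I(A)sin
    x6b = [growth * (a.imag * c + a.real * s) for a in A]  # I(A)cos + R(A)sin
    return x6a, x6b

def complex_solution(A, p, q, t):
    """Evaluate the complex solution A e^{(p+iq)t} at time t."""
    factor = cmath.exp(complex(p, q) * t)
    return [a * factor for a in A]

A = [1 + 2j, 1j]            # sample complex solution vector
p, q, t = -1.0, 2.0, 0.7    # sample root p + iq and sample time
x6a, x6b = real_solutions(A, p, q, t)
z = complex_solution(A, p, q, t)

# (6a) should agree with the real part, (6b) with the imaginary part.
print(all(abs(z[k].real - x6a[k]) < 1e-12 for k in range(len(A))))
print(all(abs(z[k].imag - x6b[k]) < 1e-12 for k in range(len(A))))
```

Working with the real and imaginary parts of a single complex solution avoids substituting trigonometric trial solutions into each equation of the system.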

If $|P(m)| = 0$ has a double root, say $m = r$, we proceed
very much as in the case of a single differential equation. If $A$
is a solution of the equation $P(r)A = 0$, then, of course,

$$Ae^{rt}$$

is one solution of (3). However, as a second independent solution
we must try not $Bte^{rt}$, as strict analogy with the scalar case would
suggest, but rather

$$B_1te^{rt} + B_2e^{rt} \tag{7}$$

The term $B_2e^{rt}$ must be retained in the matric case because in
general the matrix $B_2$ will not be a scalar multiple of $A$, and hence,
in constructing the complete solution, the term $B_2e^{rt}$ cannot be
absorbed in the term $Ae^{rt}$, as is necessarily the case for a single
scalar differential equation. It can be shown, however, that to
within an arbitrary scalar factor the matrix $B_1$ is the same as $A$.
Hence, after (7) has been substituted into the homogeneous
system (3), it is only necessary to solve for the ratios of the
components of the matrix $B_2$. Similar observations hold for roots of (5)
of higher multiplicity. Thus, for a $k$-fold root $r$, the appropriate
solutions are not $Ae^{rt}, Bte^{rt}, \ldots, Kt^{k-1}e^{rt}$ but rather

$$
\begin{aligned}
&Ae^{rt} \\
&B_1te^{rt} + B_2e^{rt} \\
&\;\;\vdots \\
&K_1t^{k-1}e^{rt} + K_2t^{k-2}e^{rt} + \cdots + K_ke^{rt}
\end{aligned}
$$

In this case, to within arbitrary scalar factors, the matrices
$A, B_1, \ldots, K_1$ are identical.
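For the double-root case, the conditions on the coefficient matrices in (7) can be sketched as follows. This is an outline under two stated assumptions rather than a proof: the operator identity $P(D)(te^{rt}) = P(r)te^{rt} + P'(r)e^{rt}$, where $P'$ denotes the matrix of element-by-element derivatives, and the assumption that $P(r)A = 0$ determines $A$ to within a scalar factor.

```latex
\begin{aligned}
P(D)\,(B_1 t e^{rt} + B_2 e^{rt})
  &= [P(r)B_1\,t e^{rt} + P'(r)B_1\,e^{rt}] + P(r)B_2\,e^{rt} \\
  &= t e^{rt}\,P(r)B_1 + e^{rt}\,[P(r)B_2 + P'(r)B_1] = 0
\end{aligned}
```

Since $te^{rt}$ and $e^{rt}$ are linearly independent, each coefficient must vanish separately: $P(r)B_1 = 0$ and $P(r)B_2 + P'(r)B_1 = 0$. The first condition is the equation satisfied by $A$, which under the second assumption makes $B_1$ a scalar multiple of $A$; the second condition then determines $B_2$.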




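EXAMPLE 1

Find a complete solution of the system

$$
\begin{aligned}
(D^2 + D + 8)x_1 + (D^2 + 6D + 3)x_2 &= 0 \\
(D + 1)x_1 + (D^2 + 1)x_2 &= 0
\end{aligned}
$$

In this case the characteristic equation (5) is

$$
\begin{vmatrix}
m^2 + m + 8 & m^2 + 6m + 3 \\
m + 1 & m^2 + 1
\end{vmatrix}
= m^4 + 2m^2 - 8m + 5 = 0
$$

with roots $1, 1, -1 \pm 2i$. For the root $-1 + 2i$, Eq. (4) becomes

$$
\begin{Vmatrix}
(-1+2i)^2 + (-1+2i) + 8 & (-1+2i)^2 + 6(-1+2i) + 3 \\
(-1+2i) + 1 & (-1+2i)^2 + 1
\end{Vmatrix}
\cdot
\begin{Vmatrix} a_1 \\ a_2 \end{Vmatrix}
=
\begin{Vmatrix}
4 - 2i & -6 + 8i \\
2i & -2 - 4i
\end{Vmatrix}
\cdot
\begin{Vmatrix} a_1 \\ a_2 \end{Vmatrix}
=
\begin{Vmatrix} 0 \\ 0 \end{Vmatrix}
$$

This is equivalent to the two scalar equations

$$
\begin{aligned}
(4 - 2i)a_1 + (-6 + 8i)a_2 &= 0 \\
2ia_1 - (2 + 4i)a_2 &= 0
\end{aligned}
$$

Since $m = -1 + 2i$ is a root of the characteristic equation (5), these two equations are
dependent, and the ratio of $a_1$ to $a_2$ can be found equally well from either of them. Using the
second, since it is a little simpler, we therefore have

$$
\frac{a_1}{a_2} = \frac{1 + 2i}{i}
\qquad \text{and} \qquad
A = \begin{Vmatrix} 1 + 2i \\ i \end{Vmatrix}
$$

Hence,

$$
\mathcal{R}(A) = \begin{Vmatrix} 1 \\ 0 \end{Vmatrix}
\qquad \text{and} \qquad
\mathcal{I}(A) = \begin{Vmatrix} 2 \\ 1 \end{Vmatrix}
$$

and thus from (6) we have the two particular solutions

$$
X_1 = e^{-t}\left(\begin{Vmatrix} 1 \\ 0 \end{Vmatrix}\cos 2t - \begin{Vmatrix} 2 \\ 1 \end{Vmatrix}\sin 2t\right)
\qquad
X_2 = e^{-t}\left(\begin{Vmatrix} 2 \\ 1 \end{Vmatrix}\cos 2t + \begin{Vmatrix} 1 \\ 0 \end{Vmatrix}\sin 2t\right)
$$

For the repeated root $m = 1$, we have one solution of the form $Be^t$, where, from (4),

$$
\begin{Vmatrix} 10 & 10 \\ 2 & 2 \end{Vmatrix}
\cdot
\begin{Vmatrix} b_1 \\ b_2 \end{Vmatrix}
=
\begin{Vmatrix} 0 \\ 0 \end{Vmatrix}
\qquad \text{so we can take} \qquad
B = \begin{Vmatrix} b_1 \\ b_2 \end{Vmatrix} = \begin{Vmatrix} 1 \\ -1 \end{Vmatrix}
$$

As a second solution we have, from (7),

$$C_1te^t + C_2e^t$$

or, since $C_1 = B$ (as we observed above, without proof),

$$
\begin{Vmatrix} 1 \\ -1 \end{Vmatrix}te^t + \begin{Vmatrix} c_{12} \\ c_{22} \end{Vmatrix}e^t
$$

Substituting this into the original system, we obtain two equations, each of which reduces to

$$2c_{12} + 2c_{22} = 1$$

Hence we can take*

$$c_{12} = 0 \qquad \text{and} \qquad c_{22} = \tfrac{1}{2}$$

The solutions associated with the double root $m = 1$ are, therefore,

$$
X_3 = \begin{Vmatrix} 1 \\ -1 \end{Vmatrix}e^t
\qquad \text{and} \qquad
X_4 = \begin{Vmatrix} 1 \\ -1 \end{Vmatrix}te^t + \begin{Vmatrix} 0 \\ \tfrac{1}{2} \end{Vmatrix}e^t
$$

* The most general choice, $c_{12} = \lambda$, $c_{22} = (1 - 2\lambda)/2$, leads to the same
expression for $X_4$ plus a matrix proportional to $X_3$ which can be combined
with $X_3$ when the complete solution is constructed.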





The complete solution of the original system is now

$$
X = \begin{Vmatrix} x_1 \\ x_2 \end{Vmatrix} = k_1X_1 + k_2X_2 + k_3X_3 + k_4X_4
$$

or, in scalar form,

$$
\begin{aligned}
x_1 &= e^{-t}[(k_1 + 2k_2)\cos 2t - (2k_1 - k_2)\sin 2t] + k_3e^t + k_4te^t \\
x_2 &= e^{-t}[k_2\cos 2t - k_1\sin 2t] - (k_3 - \tfrac{1}{2}k_4)e^t - k_4te^t
\end{aligned}
$$
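The algebra of Example 1 can be spot-checked with a few lines of arithmetic. The sketch below is an illustration only: the helper names `P`, `dP`, `matvec`, and `det2` are our own, and the double-root check uses the conditions $P(1)B = 0$ and $P(1)C_2 + P'(1)B = 0$, which are the relations stated in Exercise 10 at the end of this section.

```python
from fractions import Fraction

def P(m):
    """Operator matrix of Example 1 with D replaced by the number m."""
    return [[m * m + m + 8, m * m + 6 * m + 3],
            [m + 1,         m * m + 1]]

def dP(m):
    """Element-by-element derivative P'(m) of the matrix above."""
    return [[2 * m + 1, 2 * m + 6],
            [1,         2 * m]]

def matvec(M, v):
    """Product of a matrix and a column vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# The characteristic determinant vanishes at the roots 1 and -1 + 2i.
print(det2(P(1)), abs(det2(P(-1 + 2j))))

# Complex root: A = (1 + 2i, i) satisfies P(m)A = 0.
print(matvec(P(-1 + 2j), [1 + 2j, 1j]))

# Double root m = 1: B = (1, -1) and C2 = (0, 1/2).
B, C2 = [1, -1], [0, Fraction(1, 2)]
print(matvec(P(1), B))
residual = [u + v for u, v in zip(matvec(P(1), C2), matvec(dP(1), B))]
print(residual)
```

Each printed vector should consist of zeros, which confirms that the coefficient vectors chosen in the example satisfy the required algebraic systems.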

To find a particular integral of the nonhomogeneous system 
(2), we proceed very much as in the case of a single scalar equa- 
tion. In fact, for vectors F(t) which have only a finite number of 
independent derivatives and which do not duplicate vectors 
already in the complementary function, the results of Table 2.2, 
Sec. 2.3, can be used without change, provided only that the 
arbitrary scalar constants appearing in the entries in the table be 
replaced by arbitrary constant vectors. The trial solutions are 
then substituted into the nonhomogeneous system, and the 
arbitrary components of the coefficient vectors are determined 
to make the resulting equations identically true. The only sig- 
nificant difference between the scalar case and the matric case 
is that in the latter, when duplication occurs between a vector 
on the right of (2) and a vector in the complementary function, 
not only must the usual choice for a particular integral be multi- 
plied by the lowest positive integral power of the independent 
variable which will eliminate the duplication but the products of 
the normal choice and all lower nonnegative integral powers of 
the independent variable must also be included in the actual 
choice. 


EXERCISES 

Find a complete solution of each of the following systems: 


1 (D + 5)x + (D + 7)y = 2e‘ 

(2D + 1)* + (3D + 1)2/ - e' 

3 (D + l)x + (D + 2 )y - — e* 

(3D -f l)x + (4 D + 7 )y = -7e* 

6 (2D + l)x + (D + 2)y = 1 

(D + 2)x + (D + 4)y - 2 

7 (2D + l)x + (D + 2 )y = er‘ 
(3D - 7)x + (3D + l) tf - 0 


2 (D + 2)x + (D + 3 )y =2 t + i 

(2D - 6)* -f (3D - 4)y - -61 - 2 

4 (D + 1)* + (D + 2)y - -f + 1 

(5 D + l)i + (6D + 3)i/ = -2 1 + 1 

6 (D + l)x + (4D - 2)y = tr* 

(D + 2)x + (5D - 2 )y - tr* 

8 (2D + 1)* + (D + 2 )y = sin t 

(3D + 1)* + (3D + 5)i/ = cos t 


9 Show that $D^r(te^{mt}) = m^rte^{mt} + rm^{r-1}e^{mt}$. Hence show that

$$p(D)te^{mt} = p(m)te^{mt} + p'(m)e^{mt}$$
$$\text{and} \qquad P(D)te^{mt} = P(m)te^{mt} + P'(m)e^{mt}$$

where $p(D)$ is a polynomial in the operator $D$ and $P(D)$ is a matrix whose elements are
polynomials in $D$.

10 Using the results of Exercise 9, show that, if $m_1$ is a double root of the characteristic
equation $|P(m)| = 0$ and if $X_1 = Ae^{m_1t}$ is one solution of the system $P(D)X = 0$, the
coefficients in the second independent solution $X_2 = B_1te^{m_1t} + B_2e^{m_1t}$ satisfy the equations
$P(m_1)B_1 = 0$ and $P(m_1)B_2 + P'(m_1)B_1 = 0$.


CHAPTER ELEVEN 


Further Properties 
of Matrices 


11.1 

Quadratic forms 


In this section we shall continue our study of matrices by
introducing the important mathematical objects known as quadratic
forms, hermitian forms, and bilinear forms.

By a quadratic form we mean a homogeneous second-degree
expression in $n$ variables of the form

$$
\begin{aligned}
Q(x) = {} & a_{11}x_1^2 + 2a_{12}x_1x_2 + \cdots + 2a_{1n}x_1x_n \\
& + a_{22}x_2^2 + \cdots + 2a_{2n}x_2x_n \\
& + \cdots \\
& + a_{nn}x_n^2
\end{aligned}
$$

Usually the cross products are separated into two equal terms,
and the whole expression is written in the more symmetric form

$$
\begin{aligned}
Q(x) = {} & a_{11}x_1^2 + a_{12}x_1x_2 + \cdots + a_{1n}x_1x_n \\
& + a_{21}x_2x_1 + a_{22}x_2^2 + \cdots + a_{2n}x_2x_n \\
& + \cdots \\
& + a_{n1}x_nx_1 + a_{n2}x_nx_2 + \cdots + a_{nn}x_n^2
\end{aligned}
$$

where now, of course, $a_{ij} = a_{ji}$. If a quadratic form with real
coefficients has the property that it is equal to or greater than
(equal to or less than) zero for all real values of its variables, it is
said to be positive (negative). A positive (negative) form which
is zero only for the values $x_1 = x_2 = \cdots = x_n = 0$ is said to be
positive-definite (negative-definite). Positive-definite and
negative-definite forms are sometimes referred to collectively simply
as definite forms. A positive (negative) form which is zero for real
values other than $x_1 = x_2 = \cdots = x_n = 0$ is said to be
positive-semidefinite (negative-semidefinite). A real quadratic
form which can take on both positive and negative values is said
to be indefinite. Examples of quadratic forms of each type are
shown in the following table:


TABLE 11.1

Type of quadratic form      Example

Positive-definite           $x_1^2 + x_2^2$
Negative-definite           $-(x_1^2 + x_2^2)$
Positive-semidefinite       $(x_1 - x_2)^2$
Negative-semidefinite       $-(x_1 - x_2)^2$
Indefinite                  $x_1^2 - x_2^2$


If we define the matrices

$$
X = \begin{Vmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{Vmatrix}
\qquad \text{and} \qquad
A = \begin{Vmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{Vmatrix}
$$

it is clear from the definition of matric multiplication that the
quadratic form (1) can be written in the compact form

$$Q(x) = X^TAX \qquad A \text{ symmetric} \tag{2}$$

In this notation $A$ is called the matrix of the quadratic form and
is said to be positive- or negative-definite, semidefinite, or
indefinite according to the nature of $Q(x)$. $Q(x)$, in turn, is
said to be singular or nonsingular according as $A$ is singular or
nonsingular, that is, according as $|A|$ is equal to zero or different
from zero.

If a quadratic form is definite, it is necessarily nonsingular,
for we can write

$$
\begin{aligned}
Q(x) = {} & (a_{11}x_1 + \cdots + a_{1n}x_n)x_1 \\
& + (a_{21}x_1 + \cdots + a_{2n}x_n)x_2 \\
& + \cdots \\
& + (a_{n1}x_1 + \cdots + a_{nn}x_n)x_n
\end{aligned}
$$

and, if we suppose that $|A| = 0$, then the system of equations
obtained by equating to zero the expressions in parentheses has a
nontrivial solution (Corollary 1, Theorem 7, Sec. 10.5); and for
these values $Q(x)$ is obviously equal to zero, contrary to the
hypothesis that it is definite. The converse of this observation
is not true, however; that is, a nonsingular quadratic form is not
necessarily definite. For instance, the form

$$x_1^2 - 2x_1x_2 + 2x_2^2 - x_3^2$$

is nonsingular, since the determinant of its matrix,

$$
\begin{vmatrix}
1 & -1 & 0 \\
-1 & 2 & 0 \\
0 & 0 & -1
\end{vmatrix} = -1
$$

is different from zero; yet it is not definite, since it is equal to
zero for the nontrivial values $x_1 = 1$, $x_2 = 0$, $x_3 = 1$. The
complete criterion for the definiteness of a quadratic form is contained
in the following theorem, for whose proof we must refer to texts
on higher algebra.*


THEOREM 1 

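A necessary and sufficient condition that the real quadratic form $X^TAX$ be
positive-definite (negative-definite) is that the quantities

$$
a_{11}
\qquad
\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}
\qquad
\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
\qquad \cdots \qquad
\begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix}
$$

all be positive (alternate in sign, with $a_{11}$ negative).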


Clearly, equivalent sets of necessary and sufficient conditions 
can be obtained by first permuting the variables and then apply- 
ing Theorem 1. This gives us the following somewhat more general 
theorems: 


THEOREM 2 

A necessary and sufficient condition that the real quadratic form $X^TAX$ be
positive-definite is that every principal minor of $A$ be positive.


THEOREM 3

A necessary and sufficient condition that the real quadratic form $X^TAX$ be
negative-definite is that every principal minor of $A$ of odd order be negative and
every principal minor of $A$ of even order be positive.


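EXAMPLE 1

The quadratic form

$$
\begin{Vmatrix} x_1 & x_2 & x_3 \end{Vmatrix}
\cdot
\begin{Vmatrix} 1 & 2 & -2 \\ 2 & 5 & -4 \\ -2 & -4 & 5 \end{Vmatrix}
\cdot
\begin{Vmatrix} x_1 \\ x_2 \\ x_3 \end{Vmatrix}
=
\begin{aligned}[t]
& x_1^2 + 2x_1x_2 - 2x_1x_3 \\
& + 2x_2x_1 + 5x_2^2 - 4x_2x_3 \\
& - 2x_3x_1 - 4x_3x_2 + 5x_3^2
\end{aligned}
$$

is positive-definite, since the three quantities

$$
1
\qquad \text{and} \qquad
\begin{vmatrix} 1 & 2 \\ 2 & 5 \end{vmatrix} = 1
\qquad \text{and} \qquad
\begin{vmatrix} 1 & 2 & -2 \\ 2 & 5 & -4 \\ -2 & -4 & 5 \end{vmatrix} = 1
$$

are all positive. In fact, the quadratic form can be written equivalently as

$$(x_1 + 2x_2 - 2x_3)^2 + x_2^2 + x_3^2$$

which, being a sum of squares, can vanish only if

$$x_1 + 2x_2 - 2x_3 = 0 \qquad \text{and} \qquad x_2 = 0 \qquad \text{and} \qquad x_3 = 0$$

and these, in turn, can hold simultaneously only if $x_1 = x_2 = x_3 = 0$.
On the other hand, the quadratic form

$$
\begin{Vmatrix} x_1 & x_2 & x_3 \end{Vmatrix}
\cdot
\begin{Vmatrix} 1 & 2 & -2 \\ 2 & 3 & -4 \\ -2 & -4 & 5 \end{Vmatrix}
\cdot
\begin{Vmatrix} x_1 \\ x_2 \\ x_3 \end{Vmatrix}
=
\begin{aligned}[t]
& x_1^2 + 2x_1x_2 - 2x_1x_3 \\
& + 2x_2x_1 + 3x_2^2 - 4x_2x_3 \\
& - 2x_3x_1 - 4x_3x_2 + 5x_3^2
\end{aligned}
$$

is not definite, since the three quantities

$$
1
\qquad \text{and} \qquad
\begin{vmatrix} 1 & 2 \\ 2 & 3 \end{vmatrix} = -1
\qquad \text{and} \qquad
\begin{vmatrix} 1 & 2 & -2 \\ 2 & 3 & -4 \\ -2 & -4 & 5 \end{vmatrix} = -1
$$

do not fulfill either of the conditions of Theorem 1. In fact, this quadratic form can be written
as $(x_1 + 2x_2 - 2x_3)^2 - x_2^2 + x_3^2$; and, since this expression takes on the value 1 when $x_1 = 2$,
$x_2 = 0$, $x_3 = 1$ and takes on the value $-1$ when $x_1 = -2$, $x_2 = 1$, $x_3 = 0$, it is actually indefinite.

* See, for instance, W. L. Ferrar, "Algebra," pp. 138-141, Oxford Book
Company, Inc., New York, 1941.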
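The test of Theorem 1 is mechanical enough to sketch in code. The following is a minimal illustration, assuming a real symmetric matrix given as a list of rows; the function names are our own, and the determinant is computed by ordinary Gaussian elimination in exact rational arithmetic, a choice made here for convenience rather than one prescribed by the text.

```python
from fractions import Fraction

def det(M):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in M]
    n, sign, result = len(M), 1, Fraction(1)
    for k in range(n):
        # Find a row at or below k with a nonzero entry in column k.
        pivot_row = next((r for r in range(k, n) if M[r][k] != 0), None)
        if pivot_row is None:
            return Fraction(0)
        if pivot_row != k:
            M[k], M[pivot_row] = M[pivot_row], M[k]
            sign = -sign
        result *= M[k][k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            for c in range(k, n):
                M[r][c] -= factor * M[k][c]
    return sign * result

def leading_minors(A):
    """The quantities of Theorem 1: upper-left determinants of order 1..n."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]

def definiteness(A):
    """Apply Theorem 1 to a real symmetric matrix A."""
    minors = leading_minors(A)
    if all(d > 0 for d in minors):
        return "positive-definite"
    # Odd-order minors (index 0, 2, ...) negative, even-order minors positive.
    if all((d < 0) if k % 2 == 0 else (d > 0) for k, d in enumerate(minors)):
        return "negative-definite"
    return "not definite"

A1 = [[1, 2, -2], [2, 5, -4], [-2, -4, 5]]
A2 = [[1, 2, -2], [2, 3, -4], [-2, -4, 5]]
print(leading_minors(A1), definiteness(A1))
print(leading_minors(A2), definiteness(A2))
```

Note that the function reports only whether a form is definite; distinguishing semidefinite from indefinite forms requires more than the leading minors.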


In our definition of a quadratic form, neither the matrix of
coefficients $A$ nor the matrix of unknowns $X$ was restricted to
be real. However, in most elementary applications both $A$ and $X$
will be real, and only for real quadratic forms are such properties
as definiteness and indefiniteness defined. Actually, when
complex quantities are involved, quadratic forms, as we have defined
them, are almost always replaced by related expressions known as
hermitian forms:


DEFINITION 1

If $A$ is a hermitian matrix, the expression $\bar{X}^TAX$ is known as a hermitian form.

Recalling the definition of a hermitian matrix (Sec. 10.2), it is
easy to verify that any hermitian form is equal to its transposed
conjugate. Moreover, since it is a scalar, i.e., a (1,1) matrix, it
is also equal to its transpose. Hence, we have the following result:

THEOREM 4 

The value of a hermitian form is real for all values of its variables. 

Because of Theorem 4, positive- and negative-definite, positive- 
and negative-semidefinite, and indefinite hermitian forms can be 
defined precisely as the corresponding types of quadratic forms 
were defined. Moreover, it can be shown that the criteria for 
definiteness contained in Theorems 1, 2, and 3 hold without 
change for hermitian forms. 

Closely associated with quadratic forms are what are known 
as bilinear forms : 

DEFINITION 2 

If $A$ is a symmetric matrix, the expression $Y^TAX$ is known as a bilinear form.

Clearly, if $Y = X$, the bilinear form $Y^TAX$ becomes the quadratic
form $X^TAX$. If the components of $Y$ are thought of as the
coordinates of a "point" in a hyperspace of the appropriate
number of dimensions, the bilinear form $Y^TAX$ is sometimes
called the polar of the point $Y$ with respect to the quadratic form
$X^TAX$.

It is interesting to note that the scalar product of two vectors
$Y$ and $X$, namely, $Y^TX$, can be thought of as the bilinear form
$Y^TIX$. The condition that $Y$ and $X$ be orthogonal is, then, just
the condition that the bilinear form $Y^TIX$ be equal to zero. This
suggests that the simple notion of orthogonality introduced in
Sec. 10.2, by analogy with the familiar results of solid analytic
geometry, be extended to include the following concept of
generalized orthogonality:

DEFINITION 3

Two vectors $X$ and $Y$ are said to be orthogonal with respect to a symmetric
matrix $A$ if the bilinear form $Y^TAX$ is equal to zero.

In the spirit of Definition 3, the notion of the length of a
vector can also be generalized. In fact, the definition of the length
of a vector $X$ introduced in Sec. 10.2, namely, $\sqrt{X^TX}$, can be
rewritten $\sqrt{X^TIX}$, and this suggests

$$\sqrt{X^TAX}$$

as the generalized length of the vector $X$ with respect to the
symmetric matrix $A$. This is meaningful, however, only if the
quantity under the radical is positive. Hence, it is necessary to
require further that $A$ be the matrix of a positive-definite
quadratic form. A vector whose generalized length with respect
to a given symmetric positive-definite matrix is 1 is said to be
normalized with respect to that matrix. A vector $X$ can always
be normalized with respect to a given symmetric positive-definite
matrix $A$ by dividing it by the positive quantity $\sqrt{X^TAX}$.
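These generalized notions are easy to experiment with. The sketch below is illustrative only: the function names and the sample matrix and vectors are assumptions, chosen so that two vectors which are not orthogonal in the ordinary sense turn out to be orthogonal with respect to $A$.

```python
import math

def bilinear(Y, A, X):
    """Value of the bilinear form Y^T A X for a square matrix A."""
    n = len(X)
    return sum(Y[i] * A[i][j] * X[j] for i in range(n) for j in range(n))

def generalized_length(X, A):
    """Generalized length sqrt(X^T A X); requires X^T A X > 0."""
    q = bilinear(X, A, X)
    if q <= 0:
        raise ValueError("X^T A X must be positive")
    return math.sqrt(q)

def normalize(X, A):
    """Scale X so that its generalized length with respect to A is 1."""
    length = generalized_length(X, A)
    return [x / length for x in X]

# Sample symmetric positive-definite matrix (leading minors 2 and 3).
A = [[2, 1],
     [1, 2]]
X = [1, 0]
Y = [1, -2]

print(bilinear(Y, A, X))                       # generalized scalar product
print(sum(y * x for y, x in zip(Y, X)))        # ordinary scalar product
print(generalized_length(normalize(X, A), A))  # should be 1 up to rounding
```

Here the ordinary scalar product of the two vectors is 1, while the bilinear form $Y^TAX$ vanishes, so orthogonality with respect to a nonunit matrix is a genuinely different condition.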

Clearly, the Schmidt orthogonalization process, which we 
discussed at the end of Sec. 10.5, can be carried out equally well 
using the concepts of generalized orthogonality and generalized 
length. The notion of orthogonality with respect to a nonunit 
matrix will be of considerable importance in the work of this 
chapter. 

Just as it was convenient in analytic geometry to be able 
to remove the cross-product term from the equation of a conic, 
so in many applications involving quadratic forms it is desirable 
to be able to remove the cross-product terms by a suitable trans- 
formation and express the quadratic form as a sum of squares. 
There are many ways of doing this, among which the following, 
due to Lagrange, is particularly effective. The general idea is 
first to group together all terms containing $x_1$ as a factor and,
by suitable manipulations, make this expression a perfect square.
Then, among the terms that remain, those which contain $x_2$
as a factor are rearranged into an expression which is a perfect
square; and so on, until the process terminates. The original form
will then appear as a sum of squares of linear polynomials in
$(x_1, x_2, \ldots, x_n)$, $(x_2, x_3, \ldots, x_n)$, $\ldots$, $(x_{n-1}, x_n)$, and $x_n$,
respectively; and when these expressions are taken as new variables
the reduction is complete.

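To begin the process let us assume that $a_{11} \neq 0$ and group
together all terms containing $x_1$ as a factor:

$$
\begin{aligned}
& (a_{11}x_1^2 + 2a_{12}x_1x_2 + \cdots + 2a_{1n}x_1x_n) + \sum_{i,j=2}^{n} a_{ij}x_ix_j \\
& \qquad = a_{11}\left(x_1^2 + 2\frac{a_{12}}{a_{11}}x_1x_2 + \cdots + 2\frac{a_{1n}}{a_{11}}x_1x_n\right) + \phi_1(x_2, x_3, \ldots, x_n)
\end{aligned}
$$

Now, adding and subtracting the appropriate terms, none of
which involves $x_1$, we have

$$
\begin{aligned}
& a_{11}\left[\left(x_1 + \frac{a_{12}}{a_{11}}x_2 + \cdots + \frac{a_{1n}}{a_{11}}x_n\right)^2
 - \left(\frac{a_{12}}{a_{11}}x_2 + \cdots + \frac{a_{1n}}{a_{11}}x_n\right)^2\right] + \phi_1(x_2, x_3, \ldots, x_n) \\
& \qquad = a_{11}\left(x_1 + \frac{a_{12}}{a_{11}}x_2 + \cdots + \frac{a_{1n}}{a_{11}}x_n\right)^2 + \phi_2(x_2, x_3, \ldots, x_n)
\end{aligned}
$$

The obviously nonsingular transformation

$$
T_1\colon \quad
\begin{aligned}
y_1 &= x_1 + \frac{a_{12}}{a_{11}}x_2 + \cdots + \frac{a_{1n}}{a_{11}}x_n \\
y_2 &= x_2 \\
&\;\;\vdots \\
y_n &= x_n
\end{aligned}
$$

now reduces $Q(x)$ to the form

$$a_{11}y_1^2 + \phi_2(y_2, y_3, \ldots, y_n)$$

where $\phi_2(y_2, y_3, \ldots, y_n)$ is, of course, a quadratic form in the
$n - 1$ variables $y_2, y_3, \ldots, y_n$, with coefficients $b_{ij}$, say.

The same process is next applied to $\phi_2$, and a second
nonsingular transformation, of the form

$$
T_2\colon \quad
\begin{aligned}
z_1 &= y_1 \\
z_2 &= y_2 + \frac{b_{23}}{b_{22}}y_3 + \cdots + \frac{b_{2n}}{b_{22}}y_n \\
z_3 &= y_3 \\
&\;\;\vdots \\
z_n &= y_n
\end{aligned}
$$

extends the reduction to

$$a_{11}z_1^2 + b_{22}z_2^2 + \phi_3(z_3, z_4, \ldots, z_n)$$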

The continuation is now obvious, and the required transformation
is, finally, the product of the successive transformations
$T_1, T_2, \ldots, T_n$.

If at any stage all square terms are missing from the form
$\phi_i(u_i, u_{i+1}, \ldots, u_n)$, the process must be modified. If this occurs,
either no more terms remain and the reduction is complete, or
else there is at least one cross-product term with nonzero
coefficient, say $u_ju_{j+1}$. If this is the case, the nonsingular transformation

$$
\begin{aligned}
u_1 &= u_1' \\
&\;\;\vdots \\
u_{j-1} &= u_{j-1}' \\
u_j &= u_j' + u_{j+1}' \\
u_{j+1} &= u_j' - u_{j+1}' \\
u_{j+2} &= u_{j+2}' \\
&\;\;\vdots \\
u_n &= u_n'
\end{aligned}
$$

will clearly introduce a term in $(u_j')^2$, and the process can be
continued in its original form.

It is important to note that the linear transformation 
employed at each stage is rank-preserving. Hence, since the rank 
of a diagonal matrix is equal to the number of its nonzero diagonal 
elements, it follows that, when X T AX is transformed to a sum 
of squares by the Lagrange reduction, the number of square 
terms present in the final result is equal to the rank of the matrix 
of the original form. It is also clear that, when a positive-definite 
quadratic form is reduced to a sum of squares by the Lagrange 
reduction, the final result must consist of the square of each 
variable with a positive coefficient. 
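The Lagrange reduction can be sketched as a short routine. The version below is a simplified illustration, not a full implementation of the procedure: it handles only the case in which the required square term is present at each stage (or the current variable has dropped out entirely) and raises an error where the modified step described above would be needed. The name `lagrange_reduce` and the sample matrix are assumptions made for the illustration.

```python
from fractions import Fraction

def lagrange_reduce(A):
    """Reduce X^T A X to d[0]*w_1^2 + ... + d[n-1]*w_n^2 with W = P X.

    A is a symmetric matrix given as a list of rows. Returns (P, d).
    Only the case with a nonzero square term at each stage is handled.
    """
    n = len(A)
    S = [[Fraction(x) for x in row] for row in A]   # working copy of the form
    P = [[Fraction(0)] * n for _ in range(n)]
    d = []
    for k in range(n):
        pivot = S[k][k]
        if pivot == 0:
            if any(S[k][j] != 0 for j in range(k + 1, n)):
                raise ValueError("square term missing; modified step required")
            P[k][k] = Fraction(1)       # variable k no longer appears
            d.append(Fraction(0))
            continue
        # New variable: w_k = x_k + (a_k,k+1 / a_kk) x_k+1 + ...
        for j in range(k, n):
            P[k][j] = S[k][j] / pivot
        d.append(pivot)
        # Subtract the completed square from the remaining form.
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                S[i][j] -= S[k][i] * S[k][j] / pivot
    return P, d

# Sample form: x1^2 - 2x1x2 + 4x1x3 + 2x2^2 - 4x2x3 + 2x2x4
#              + 5x3^2 + 2x3x4 + 4x4^2
A = [[1, -1, 2, 0],
     [-1, 2, -2, 1],
     [2, -2, 5, 1],
     [0, 1, 1, 4]]
P, d = lagrange_reduce(A)
rank = sum(1 for coefficient in d if coefficient != 0)
print([[int(x) for x in row] for row in P], [int(x) for x in d], rank)
```

The number of nonzero entries in `d` equals the rank of the matrix of the form, in agreement with the remark on rank-preserving transformations above.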

EXAMPLE 2

Find a transformation which will reduce to a sum of squares the quadratic form $X^TAX$, where

$$
A = \begin{Vmatrix}
1 & -1 & 2 & 0 \\
-1 & 2 & -2 & 1 \\
2 & -2 & 5 & 1 \\
0 & 1 & 1 & 4
\end{Vmatrix}
$$

Following the Lagrange procedure, we first group together the terms containing $x_1$ as a
factor, and then complete the square on these terms:

$$
\begin{aligned}
& (x_1^2 - 2x_1x_2 + 4x_1x_3) + (2x_2^2 - 4x_2x_3 + 2x_2x_4 + 5x_3^2 + 2x_3x_4 + 4x_4^2) \\
& \quad = [(x_1 - x_2 + 2x_3)^2 - x_2^2 + 4x_2x_3 - 4x_3^2] \\
& \qquad + (2x_2^2 - 4x_2x_3 + 2x_2x_4 + 5x_3^2 + 2x_3x_4 + 4x_4^2) \\
& \quad = (x_1 - x_2 + 2x_3)^2 + (x_2^2 + 2x_2x_4 + x_3^2 + 2x_3x_4 + 4x_4^2)
\end{aligned}
$$

Now we apply the transformation

$$
T_1\colon \quad
\begin{aligned}
y_1 &= x_1 - x_2 + 2x_3 \\
y_2 &= x_2 \\
y_3 &= x_3 \\
y_4 &= x_4
\end{aligned}
$$

getting

$$y_1^2 + (y_2^2 + 2y_2y_4 + y_3^2 + 2y_3y_4 + 4y_4^2)$$

We next apply the same procedure to the function of $y_2$, $y_3$, $y_4$ which remains:

$$
\begin{aligned}
y_1^2 + (y_2^2 + 2y_2y_4) + (y_3^2 + 2y_3y_4 + 4y_4^2)
&= y_1^2 + [(y_2 + y_4)^2 - y_4^2] + (y_3^2 + 2y_3y_4 + 4y_4^2) \\
&= y_1^2 + (y_2 + y_4)^2 + (y_3^2 + 2y_3y_4 + 3y_4^2)
\end{aligned}
$$

We now apply the transformation

$$
T_2\colon \quad
\begin{aligned}
z_1 &= y_1 \\
z_2 &= y_2 + y_4 \\
z_3 &= y_3 \\
z_4 &= y_4
\end{aligned}
$$

getting

$$z_1^2 + z_2^2 + (z_3^2 + 2z_3z_4 + 3z_4^2)$$

A repetition of the process now yields

$$z_1^2 + z_2^2 + (z_3 + z_4)^2 + 2z_4^2$$

Hence, the final transformation

$$
T_3\colon \quad
\begin{aligned}
w_1 &= z_1 \\
w_2 &= z_2 \\
w_3 &= z_3 + z_4 \\
w_4 &= z_4
\end{aligned}
$$

reduces the original quadratic form to the expression

$$w_1^2 + w_2^2 + w_3^2 + 2w_4^2$$

as required.

The single transformation which accomplishes the reduction is of course the product of the
transformations $T_1$, $T_2$, and $T_3$, that is, the transformation which results when the $y$'s and $z$'s
are eliminated and the $w$'s are expressed directly in terms of the $x$'s. This is easily found to be

$$
T\colon \quad W = PX
\qquad \text{where} \qquad
P = \begin{Vmatrix}
1 & -1 & 2 & 0 \\
0 & 1 & 0 & 1 \\
0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1
\end{Vmatrix}
$$

To verify that this transformation actually reduces $X^TAX$ to a sum of squares, it is
necessary that the transformation $T$ be solved for $X$, so that we can substitute for $X$ in the expression
$X^TAX$. To do this, we multiply both sides of the equation $W = PX$ by the inverse of $P$, which
surely exists since $P$ is nonsingular. This gives us

$$
T^{-1}\colon \quad X = P^{-1}W
\qquad \text{where} \qquad
P^{-1} = \begin{Vmatrix}
1 & 1 & -2 & 1 \\
0 & 1 & 0 & -1 \\
0 & 0 & 1 & -1 \\
0 & 0 & 0 & 1
\end{Vmatrix}
$$

Under $T^{-1}$ the original form becomes

$$(P^{-1}W)^TA(P^{-1}W) = W^T(P^{-1})^TAP^{-1}W = W^T[(P^{-1})^TAP^{-1}]W$$

and it is easy to verify that $(P^{-1})^TAP^{-1}$ is indeed the diagonal matrix

$$
\begin{Vmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 2
\end{Vmatrix}
$$
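The final verification of Example 2 amounts to two matrix products, which can be carried out directly. This is a small sketch with helper names of our own choosing; the matrices are those of the form $x_1^2 - 2x_1x_2 + 4x_1x_3 + 2x_2^2 - 4x_2x_3 + 2x_2x_4 + 5x_3^2 + 2x_3x_4 + 4x_4^2$ and of the transformation $w_1 = x_1 - x_2 + 2x_3$, $w_2 = x_2 + x_4$, $w_3 = x_3 + x_4$, $w_4 = x_4$.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

A = [[1, -1, 2, 0],
     [-1, 2, -2, 1],
     [2, -2, 5, 1],
     [0, 1, 1, 4]]

P = [[1, -1, 2, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 1],
     [0, 0, 0, 1]]

P_inv = [[1, 1, -2, 1],
         [0, 1, 0, -1],
         [0, 0, 1, -1],
         [0, 0, 0, 1]]

# P times its claimed inverse should be the unit matrix.
print(matmul(P, P_inv))

# (P^{-1})^T A P^{-1} should be the diagonal matrix with entries 1, 1, 1, 2.
D = matmul(matmul(transpose(P_inv), A), P_inv)
print(D)
```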

If $f(x_1, x_2, \ldots, x_n)$ is a function of $n$ variables which
possesses first partial derivatives in the neighborhood of some
"point" $P\colon (a_1, a_2, \ldots, a_n)$, it is shown in calculus that a
necessary condition for $f$ to have a maximum or a minimum at the
point $P$ is that at $P$ each of the first partial derivatives of $f$ be
zero. In elementary calculus, sufficient conditions for a point $P$
to be a maximum or a minimum of $f$ are usually not obtained;
but, with the fundamental properties of quadratic forms
available, these can be formulated in a relatively simple way. In doing
this, we shall use the Taylor expansion of $f$ around the point
$P\colon (a_1, a_2, \ldots, a_n)$, which means that our conclusions are valid only
for functions possessing such expansions.*

Under our assumption that $f$ has a Taylor expansion around
the point $P\colon (a_1, a_2, \ldots, a_n)$, we can write, using the operational
notation developed in calculus,

$$
\begin{aligned}
f(x_1, \ldots, x_n) = {} & f(a_1, \ldots, a_n) \\
& + \left[(x_1 - a_1)\frac{\partial}{\partial x_1} + \cdots + (x_n - a_n)\frac{\partial}{\partial x_n}\right] f(x_1, \ldots, x_n)\Big|_{a_1, \ldots, a_n} \\
& + \frac{1}{2!}\left[(x_1 - a_1)\frac{\partial}{\partial x_1} + \cdots + (x_n - a_n)\frac{\partial}{\partial x_n}\right]^2 f(x_1, \ldots, x_n)\Big|_{a_1, \ldots, a_n} \\
& + \cdots
\end{aligned}
$$

Now, by hypothesis,

$$
\frac{\partial f}{\partial x_1}\Big|_{a_1, \ldots, a_n} = \cdots = \frac{\partial f}{\partial x_n}\Big|_{a_1, \ldots, a_n} = 0
$$

Hence, letting $X_i = x_i - a_i$, we have

$$
\begin{aligned}
f(x_1, \ldots, x_n) - f(a_1, \ldots, a_n)
= {} & \frac{1}{2}\left[X_1\frac{\partial}{\partial x_1} + \cdots + X_n\frac{\partial}{\partial x_n}\right]^2 f\,\Big|_{a_1, \ldots, a_n} \\
& + \text{terms involving the third and higher powers of the infinitesimals } X_1, \ldots, X_n
\end{aligned}
\tag{3}
$$

* Actually, if we use Taylor's theorem rather than Taylor's expansion, we
need assume only the existence of the third derivatives of $f$.


Clearly, in the neighborhood of $P\colon (a_1, \ldots, a_n)$ the principal
part of the right-hand side in the last expression is in general the
first group of terms, which together constitute a quadratic form
in the $X$'s in which, specifically, the coefficient of the product
$X_iX_j$ is

$$
\frac{\partial^2 f}{\partial x_i\,\partial x_j}\Big|_{a_1, \ldots, a_n} \quad (i \neq j)
\qquad \text{and} \qquad
\frac{1}{2}\,\frac{\partial^2 f}{\partial x_i^2}\Big|_{a_1, \ldots, a_n} \quad (i = j)
$$

Now $P\colon (a_1, \ldots, a_n)$ will be a local maximum of $f$ if and
only if the difference $f(x_1, \ldots, x_n) - f(a_1, \ldots, a_n)$ is negative
for all sufficiently small values of $X_i = x_i - a_i$ $(i = 1, 2, \ldots, n)$
which are not all zero. And this will be the case if the quadratic
form in the $X$'s is negative-definite. Similarly, $P$ will be a
local minimum if and only if the difference $f(x_1, \ldots, x_n) - f(a_1, \ldots, a_n)$
is positive for all sufficiently small values of $X_i$
which are not all zero, and this will be the case if the quadratic
form in the $X$'s is positive-definite. The point $P$ will be neither a
maximum nor a minimum if the difference $f(x_1, \ldots, x_n) - f(a_1, \ldots, a_n)$
is sometimes positive and sometimes negative in
the neighborhood of $P$, and this will be the case if the quadratic
form in the $X$'s is indefinite. Finally, if the quadratic form in the
$X$'s is semidefinite, there are points distinct from, but arbitrarily
close to, $P$ at which the form is zero, and it is therefore not the
principal part of the right-hand side of Eq. (3). In this case, a
decision requires a consideration of the cubic (quartic, . . .)
terms in $X$, and takes us beyond the bounds of matrix theory.


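EXAMPLE 3

Examine the function

$$f(x_1, x_2, x_3) = 35 - 6x_1 + 2x_3 + x_1^2 - 2x_1x_2 + 2x_2^2 + 2x_2x_3 + 3x_3^2$$

for maxima and minima.

To determine at what points, if any, the given function may have maxima or minima, we
investigate the solutions, if any, of the three equations

$$
\begin{aligned}
\frac{\partial f}{\partial x_1} &= -6 + 2x_1 - 2x_2 = 0 \\
\frac{\partial f}{\partial x_2} &= -2x_1 + 4x_2 + 2x_3 = 0 \\
\frac{\partial f}{\partial x_3} &= 2 + 2x_2 + 6x_3 = 0
\end{aligned}
$$

From these we find that the only possibility for a local extremum is the point $P\colon (8, 5, -2)$.
Clearly, $f(x_1, x_2, x_3) = 9$ and

$$\frac{\partial f}{\partial x_1} = \frac{\partial f}{\partial x_2} = \frac{\partial f}{\partial x_3} = 0$$

at $P$. Moreover, at $P$,

$$
\frac{\partial^2 f}{\partial x_1^2} = 2
\qquad
\frac{\partial^2 f}{\partial x_1\,\partial x_2} = -2
\qquad
\frac{\partial^2 f}{\partial x_1\,\partial x_3} = 0
\qquad
\frac{\partial^2 f}{\partial x_2^2} = 4
\qquad
\frac{\partial^2 f}{\partial x_2\,\partial x_3} = 2
\qquad
\frac{\partial^2 f}{\partial x_3^2} = 6
$$

Hence,

$$
\begin{aligned}
f(x_1, x_2, x_3) - 9 = \frac{1}{2!}[\, & 2(x_1 - 8)^2 - 4(x_1 - 8)(x_2 - 5) \\
& + 4(x_2 - 5)^2 + 4(x_2 - 5)(x_3 + 2) + 6(x_3 + 2)^2\,] + \cdots
\end{aligned}
$$

The second-degree terms in $(x_1 - 8)$, $(x_2 - 5)$, and $(x_3 + 2)$ in this expansion constitute a
quadratic form whose matrix is

$$
\begin{Vmatrix}
1 & -1 & 0 \\
-1 & 2 & 1 \\
0 & 1 & 3
\end{Vmatrix}
$$

By Theorem 1, this quadratic form is positive-definite; hence, the point $(8, 5, -2)$ is a local
minimum of the given function.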
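The computations of Example 3 can be reproduced with a few lines of arithmetic. The sketch below is illustrative; the helper names are our own, and the first partial derivatives are entered by hand rather than derived symbolically. It checks that the gradient vanishes at $(8, 5, -2)$, that the function value there is 9, and that the quantities of Theorem 1 for the matrix of the quadratic form are all positive.

```python
def f(x1, x2, x3):
    """The function examined in Example 3."""
    return (35 - 6 * x1 + 2 * x3 + x1 ** 2 - 2 * x1 * x2
            + 2 * x2 ** 2 + 2 * x2 * x3 + 3 * x3 ** 2)

def gradient(x1, x2, x3):
    """First partial derivatives of f, written out by hand."""
    return (-6 + 2 * x1 - 2 * x2,
            -2 * x1 + 4 * x2 + 2 * x3,
            2 + 2 * x2 + 6 * x3)

# Matrix of the quadratic form: one-half the matrix of second partials.
Q = [[1, -1, 0],
     [-1, 2, 1],
     [0, 1, 3]]

def det3(M):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def leading_minors_3x3(M):
    """The three quantities of Theorem 1 for a 3 x 3 matrix."""
    return [M[0][0], M[0][0] * M[1][1] - M[0][1] * M[1][0], det3(M)]

point = (8, 5, -2)
print(f(*point), gradient(*point), leading_minors_3x3(Q))
```

Because this particular function is itself of second degree, the difference $f - 9$ equals the quadratic form exactly, with no higher-order terms to neglect.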

EXERCISES 

1 Classify each of the following quadratic forms: 
a Xi 2 + 4x 2 s + 4x a s 4- 4x1X5 + 4xtx 3 + 6x 2 x 3 
h 3xi 5 * + 3x s 4 ■+• 6x a 2 — 2xix« — 4xix a 

c — xi 4 — 3xs 4 — 5x/ 4 2xiXs + 2 xiX 3 4 2x 2 x® 
d 2xi 4 4 2x 2 s + Xa" 4 2xiX 3 4- 2x 2 x 3 

2 Find a transformation which will reduce each of the following quadratic forms to a sum of 
squares: 

a xi 4 + 5 x 2 4 4 2xr 4 4 xix 2 .*+ 2xix 3 + fixsx.i 
b xi 2 + 5x s 4 4 5x» 2 4 2xc — 2xix 2 + 4 xix 3 + 2xiXt — 6x 2 x< + 2x a x 4 
c xix 2 4 x a x 4 d xix 2 4- x a x4 4 x 4 x 5 + xsx 6 

3 If $f(X) = X^TAX$, show that $f(\lambda X + \mu Y) = \lambda^2X^TAX + 2\lambda\mu X^TAY + \mu^2Y^TAY$.

4 If $A$ is symmetric, show that $Y^TAX = X^TAY$.

5 Examine the following functions for maxima and minima: 

a 2xi 2 4- 2x1X5 4- is 2 4 6xi 4- 6x5 4- 3 
b Xi* — 2x1X2 4- 2xix 3 — 4*2X3 4- 4xi — 4x2 4 4x a 4- 4 
c — 2xi 4 — 2 x 2 2 — x 3 4 4 2xixs + 2x 2 x 3 4 4xi — 4x2 4- 2x 3 — 3 

d xi 3 4 - X2 2 — 3xi e xi 3 — 3xi*s 4* %z 3 

f sin xi 4 sin x 2 4 cos (x t + x s ) 

6 Show’ that in the neighborhood of any zero of an indefinite quadratic form the form takes on 
both positive and negative values. 

7 Prove that a nonsingular quadratic form cannot be semidefinite. 

8 If X T AX is a positive-definite quadratic form, show that (A’’ 7 ’.! I') 3 g (X T AX)(Y T AY), 
the equality sign holding if and only if either A' or Fis a null vector or A = F. (Hint: Use 
the result of Exercise 3.) 

9 From the vectors F t = ||1 0 0||, F» = ||0 1 Of, F a = ||0 0 1||, construct a set of 
vectors orthonormal with respect to the matrix 

I 1 1 0 1 

1 2 ° 

I 0 0 2] 

10 a Find the potential energy stored in the system shown in Fig. 10.1, Sec. 10.3, as a result
of the displacements $x_1$, $x_2$, and $x_3$, and show that it is a positive-definite quadratic function
of the $x$'s. (Hint: The work required to stretch a spring of modulus $k$ a distance $s$ is $\tfrac{1}{2}ks^2$.)

b Work part a for the system shown in Fig. 10.2, Sec. 10.3.





11.2 

The characteristic equation of a matrix 


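In studying linear transformations of the form

$$
Y = AX
\qquad \text{where} \qquad
Y = \begin{Vmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{Vmatrix}
\qquad
A = \begin{Vmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{Vmatrix}
\qquad
X = \begin{Vmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{Vmatrix}
$$

it is an interesting and important problem to determine what
vectors, if any, are left unchanged in direction. Since two
nontrivial vectors have the same direction if and only if one is a
nonzero scalar multiple of the other, this is equivalent to the question
of determining those vectors $X$ whose images $Y$ are of the form
$Y = \lambda X$, that is, those vectors $X$ such that

$$AX = \lambda X \qquad \text{or} \qquad (A - \lambda I)X = 0$$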

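Clearly, the matric equation $(A - \lambda I)X = 0$ is equivalent to the
scalar system

$$
\begin{aligned}
(a_{11} - \lambda)x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0 \\
a_{21}x_1 + (a_{22} - \lambda)x_2 + \cdots + a_{2n}x_n &= 0 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + (a_{nn} - \lambda)x_n &= 0
\end{aligned}
$$

and, according to Corollary 1, Theorem 7, Sec. 10.5, a homogeneous
system of this sort will have one or more nontrivial solutions
if and only if the determinant of the coefficients is equal to
zero. This condition, namely,

$$
|A - \lambda I| =
\begin{vmatrix}
(a_{11} - \lambda) & a_{12} & \cdots & a_{1n} \\
a_{21} & (a_{22} - \lambda) & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & (a_{nn} - \lambda)
\end{vmatrix}
= 0
\tag{1}
$$

is obviously a polynomial equation of degree $n$ in the parameter
$\lambda$ with leading coefficient $(-1)^n$:

$$
|A - \lambda I| = (-1)^n[\lambda^n - \beta_1\lambda^{n-1} + \beta_2\lambda^{n-2} - \cdots
+ (-1)^{n-1}\beta_{n-1}\lambda + (-1)^n\beta_n] = 0
\tag{2}
$$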

Both this equation and the equivalent equation obtained by 
dropping the factor (—1)" are known as the characteristic equa- 
tion of the matrix A, and the expression in brackets is known as 
the characteristic polynomial of A. For values of X which satisfy 
Eq. (2) and for these values only, the matric equation 
(A — XI) X = 0 has nontrivial solution vectors. The n roots of 
Eq. (2), which of course need not be distinct, are called the 
characteristic roots or characteristics values of the matrix A, 
and the corresponding solutions are called the characteristic 


FURTHER PROPERTIES OF MATRICES 


. 


vectors* of A. Although we have introduced the idea of char- 
acteristic values and characteristic vectors of a matrix in connec- 
tion with the specific problem of determining the vectors left 
invariant in direction by a linear transformation, these concepts 
are of fundamental importance in much of matrix theory and in 
many of its physical applications. 
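The definitions above can be checked numerically on a small case. The following is a minimal sketch, assuming an illustrative (2,2) matrix that does not come from the text; the helper names `characteristic_roots_2x2` and `mat_vec` are hypothetical. It solves the characteristic equation, which for a (2,2) matrix is $\lambda^2 - (\text{trace})\lambda + |A| = 0$, and confirms that $AX = \lambda X$ for hand-computed characteristic vectors.

```python
import math

def characteristic_roots_2x2(a):
    """Roots of lambda^2 - (trace) lambda + det = 0 for a real 2x2 matrix.

    This sketch only handles the case of real roots.
    """
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("complex characteristic roots are not handled here")
    root = math.sqrt(disc)
    return ((trace - root) / 2, (trace + root) / 2)

def mat_vec(a, x):
    """Matrix-vector product A X for lists of lists."""
    return [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]

# Illustrative matrix (an assumption for this example, not from the text).
A = [[2, 1], [1, 2]]
roots = characteristic_roots_2x2(A)

# Characteristic vectors found by solving (A - lambda I) X = 0 by hand.
vectors = {1.0: [1, -1], 3.0: [1, 1]}
for lam, x in vectors.items():
    # Each vector is left unchanged in direction: A X = lambda X.
    assert mat_vec(A, x) == [lam * xi for xi in x]

print(roots)
```

Any nonzero scalar multiple of either vector would pass the same check, which is why only the direction of a characteristic vector is determined.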

Since most applications involve matrices which are either real 
and symmetric or hermitian, we shall for the most part restrict 
the rest of our discussion to the characteristic values and charac- 
teristic vectors of such matrices. We begin, however, with several 
theorems which deal with the characteristic values and character- 
istic vectors of arbitrary square matrices. 

Since the characteristic equation (2) of a square matrix $A$ is a
polynomial equation, its roots, say $\lambda_1, \lambda_2, \ldots, \lambda_n$, are connected
with its coefficients $-\beta_1, \beta_2, \ldots, (-1)^n\beta_n$ by the well-known
root-coefficient relations:

$$\begin{aligned}
\beta_1 &= \lambda_1 + \lambda_2 + \cdots + \lambda_n \\
\beta_2 &= \lambda_1\lambda_2 + \lambda_1\lambda_3 + \cdots + \lambda_{n-1}\lambda_n \\
&\;\;\vdots \\
\beta_n &= \lambda_1\lambda_2 \cdots \lambda_n
\end{aligned} \tag{3}$$

Furthermore, if we set $\lambda = 0$ in Eq. (2), we obtain

$$|A| = (-1)^{2n}\beta_n = \beta_n \tag{4}$$

Hence, from the last of Eqs. (3), we have

$$|A| = \lambda_1\lambda_2 \cdots \lambda_n$$

From this it follows that $|A|$ is zero if and only if at least one of the
$\lambda$'s is zero. Thus we have established the following theorem:

THEOREM 1 

A matrix is singular if and only if at least one of its characteristic values is zero. 

Equation (4) is only the first of a series of relations connecting
the coefficients in Eq. (2) with the principal minors of $A$. For
instance, if $|A - \lambda I|$ is written as the sum of $2^n$ determinants by
repeated use of the addition theorem (Theorem 9, Sec. 10.1), it is
clear that the terms containing the first power of $\lambda$ are obtained
by multiplying the term $-\lambda$ in each diagonal element of $|A - \lambda I|$
by the $\lambda$-free part of the cofactor of that element. Thus, the coefficient
of $\lambda$ in Eq. (2), namely,

$$(-1)^{2n-1}\beta_{n-1} = -\beta_{n-1}$$

is equal to $-(A_{11} + A_{22} + \cdots + A_{nn})$


* Some writers graft the German word eigen meaning own, peculiar, or proper, 
onto the words values and vectors and use the hybrid terms eigenvalues and 
eigenvectors. 



SEC. 11.2 


THE CHARACTERISTIC EQUATION OF A MATRIX




Hence, it follows that

$$\beta_{n-1} = A_{11} + A_{22} + \cdots + A_{nn}$$

Similarly, the terms containing $\lambda^2$ in the expansion of $|A - \lambda I|$
are found by multiplying the terms containing $-\lambda$ in every pair
of diagonal elements by the $\lambda$-free part of the algebraic complement
of the second-order minor containing those diagonal elements.
Thus the coefficient of $\lambda^2$ in Eq. (2), namely,

$$(-1)^{2n-2}\beta_{n-2} = \beta_{n-2}$$

is equal to $A_{12,12} + A_{13,13} + \cdots + A_{n-1\,n,\;n-1\,n}$.
The continuation is obvious, and we, therefore, have the following
theorem:


THEOREM 2 

If $\lambda^n - \beta_1\lambda^{n-1} + \cdots + (-1)^{n-1}\beta_{n-1}\lambda + (-1)^n\beta_n = 0$ is the characteristic
equation of a square matrix $A$, then $\beta_i$ is equal to the sum of all the principal
minors of order $i$ in $A$.

For $i = 1$ we have, as a special case of Theorem 2, the relation

$$\beta_1 = \lambda_1 + \lambda_2 + \cdots + \lambda_n = a_{11} + a_{22} + \cdots + a_{nn}$$

The quantity $a_{11} + a_{22} + \cdots + a_{nn}$ is called the trace of $A$.
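Theorem 2 lends itself to a direct computation. The sketch below, which assumes an illustrative (3,3) matrix not taken from the text and uses the hypothetical helper names `det` and `beta`, forms each coefficient $\beta_i$ as the sum of the principal minors of order $i$ and then checks that a root found by inspection satisfies the resulting characteristic equation.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def beta(a, i):
    """Sum of all principal minors of order i of the square matrix a."""
    n = len(a)
    return sum(
        det([[a[r][c] for c in idx] for r in idx])
        for idx in combinations(range(n), i)
    )

# Illustrative matrix (an assumption for this example, not from the text).
A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
betas = [beta(A, i) for i in (1, 2, 3)]   # beta_1 is the trace, beta_3 is |A|

def char_poly(lam):
    """lambda^3 - beta_1 lambda^2 + beta_2 lambda - beta_3, as in Theorem 2."""
    return lam ** 3 - betas[0] * lam ** 2 + betas[1] * lam - betas[2]

# lambda = 3 makes |A - 3I| vanish, so it must also be a root of the polynomial.
assert det([[A[i][j] - (3 if i == j else 0) for j in range(3)] for i in range(3)]) == 0
assert char_poly(3) == 0
print(betas)
```

Cofactor expansion is used here only because it mirrors the definition of a minor; for matrices of any real size an elimination method would be the more practical choice.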

The characteristic polynomial of a matrix $A$ and, hence, the
coefficients $\{\beta_i\}$ and the characteristic roots $\{\lambda_i\}$ have the interesting
property of being invariant under any similarity transformation.
More precisely, we have the following theorem:


THEOREM 3 

If A and B are similar square matrices, then A and B have the same characteristic 
polynomial. 

PROOF Let $|A - \lambda I|$ be the characteristic polynomial of the matrix $A$, and
let $B$ be a matrix similar to $A$; i.e., let $B$ be any matrix such that $B = S^{-1}AS$.
Then the characteristic polynomial of $B$ is

$$\begin{aligned}
|B - \lambda I| &= |S^{-1}AS - \lambda I| \\
&= |S^{-1}AS - \lambda S^{-1}IS| \\
&= |S^{-1}(A - \lambda I)S| \\
&= |S^{-1}| \cdot |A - \lambda I| \cdot |S|
\end{aligned}$$

since the determinant of a product of square matrices is equal to the product of
the determinants of the individual matrices. Moreover, by Corollary 2, Theorem 1,
Sec. 10.3, $|S^{-1}| \cdot |S| = 1$. Hence, $|B - \lambda I| = |A - \lambda I|$, as asserted.
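As a small check of Theorem 3, the following worked example takes an illustrative matrix $A$ and an arbitrary nonsingular $S$ (both chosen here for convenience, not taken from the text) and confirms that $A$ and $B = S^{-1}AS$ have one and the same characteristic polynomial.

```latex
\[
A = \begin{Vmatrix} 2 & 1 \\ 1 & 2 \end{Vmatrix}, \qquad
S = \begin{Vmatrix} 1 & 1 \\ 0 & 1 \end{Vmatrix}, \qquad
S^{-1} = \begin{Vmatrix} 1 & -1 \\ 0 & 1 \end{Vmatrix}
\]
\[
AS = \begin{Vmatrix} 2 & 3 \\ 1 & 3 \end{Vmatrix}, \qquad
B = S^{-1}AS = \begin{Vmatrix} 1 & 0 \\ 1 & 3 \end{Vmatrix}
\]
\[
|A - \lambda I| = (2 - \lambda)^2 - 1 = \lambda^2 - 4\lambda + 3, \qquad
|B - \lambda I| = (1 - \lambda)(3 - \lambda) = \lambda^2 - 4\lambda + 3
\]
```

Note that $B$ is no longer symmetric although $A$ is; similarity preserves the characteristic polynomial, not symmetry.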

The next three theorems also deal with the characteristic 
values and characteristic vectors of arbitrary square matrices: 

THEOREM 4 

A characteristic vector of a square matrix cannot correspond to two distinct 
characteristic values. 




PROOF Let $\lambda_1$ and $\lambda_2$ be distinct characteristic values of a square matrix
$A$, and let $X_1$ be a characteristic vector of $A$ corresponding, if possible, to both
$\lambda_1$ and $\lambda_2$. Then, simultaneously,

$$(A - \lambda_1I)X_1 = 0 \quad\text{and}\quad (A - \lambda_2I)X_1 = 0$$

Hence, subtracting,

$$(\lambda_2 - \lambda_1)IX_1 = (\lambda_2 - \lambda_1)X_1 = 0 \tag{5}$$

However, by hypothesis, $\lambda_1 \ne \lambda_2$. Moreover, a characteristic vector is, by definition,
a nontrivial solution vector of $(A - \lambda I)X = 0$. Thus $X_1 \ne 0$, and therefore
Eq. (5) cannot hold. Hence, the assumption that a characteristic vector can
correspond to two distinct characteristic values must be abandoned, and the
theorem is established.

THEOREM 5

If $X_1, X_2, \ldots, X_m$ $(m \le n)$ are characteristic vectors corresponding respectively
to the distinct characteristic values $\lambda_1, \lambda_2, \ldots, \lambda_m$ of an $(n,n)$ matrix $A$, then
$X_1, X_2, \ldots, X_m$ are linearly independent.

PROOF Let $X_1, X_2, \ldots, X_m$ be characteristic vectors corresponding respectively
to the distinct characteristic values $\lambda_1, \lambda_2, \ldots, \lambda_m$ of a square matrix $A$,
and let us suppose, contrary to the theorem, that $X_1, X_2, \ldots, X_m$ are dependent.
More specifically, let us suppose that the maximum number of linearly
independent vectors in the set is $k$, where $1 \le k < m$, and, for convenience,
let them be the first $k$ $X$'s. Then the relation

$$\alpha_1X_1 + \alpha_2X_2 + \cdots + \alpha_kX_k = 0$$

implies that $\alpha_1 = \alpha_2 = \cdots = \alpha_k = 0$, but there does exist a nontrivial set
of $\beta$'s, with $\beta_{k+1} \ne 0$, such that

$$\beta_1X_1 + \beta_2X_2 + \cdots + \beta_kX_k + \beta_{k+1}X_{k+1} = 0 \tag{6}$$

Now multiply Eq. (6) on the left by the matrix $A$, getting

$$\beta_1AX_1 + \beta_2AX_2 + \cdots + \beta_kAX_k + \beta_{k+1}AX_{k+1} = 0$$

However, $AX_i = \lambda_iX_i$ for each $i$. Hence the last equation becomes

$$\beta_1\lambda_1X_1 + \beta_2\lambda_2X_2 + \cdots + \beta_k\lambda_kX_k + \beta_{k+1}\lambda_{k+1}X_{k+1} = 0 \tag{7}$$

If we now multiply Eq. (6) by $\lambda_{k+1}$ and subtract from Eq. (7), we obtain

$$(\lambda_1 - \lambda_{k+1})\beta_1X_1 + (\lambda_2 - \lambda_{k+1})\beta_2X_2 + \cdots + (\lambda_k - \lambda_{k+1})\beta_kX_k = 0 \tag{8}$$

Since $X_1, X_2, \ldots, X_k$ are linearly independent, by hypothesis, it follows that
each coefficient in (8) is equal to zero. Hence, since $\lambda_i - \lambda_{k+1} \ne 0$ $(i = 1, 2,
\ldots, k)$, it must be that

$$\beta_i = 0 \qquad i = 1, 2, \ldots, k$$

But, if this is the case, it follows from (6) that

$$\beta_{k+1}X_{k+1} = 0$$

which is impossible, since neither the scalar $\beta_{k+1}$ nor the vector $X_{k+1}$ is zero.
This contradiction overthrows the possibility that the characteristic vectors
$X_1, X_2, \ldots, X_m$ are linearly dependent, and the theorem is established.




In particular, if an $(n,n)$ matrix has $n$ distinct characteristic
values, the last theorem tells us that it has $n$ linearly independent
characteristic vectors. Hence, using Corollary 1, Theorem 10,
Sec. 10.5, we have the following result:

COROLLARY 1

If the characteristic values of an $(n,n)$ matrix $A$ are all distinct, then $A$ has $n$
linearly independent characteristic vectors, and any vector with $n$ components
can be expressed as a linear combination of the characteristic vectors of $A$.

Since the characteristic equation of an (n,n) matrix is always 
of degree n, it is obvious that, if repeated roots are counted the 
appropriate number of times, such a matrix always has exactly n 
characteristic roots. With the same convention one might perhaps 
be able to say that an (n,n) matrix always has exactly n char- 
acteristic vectors. However, attempting to assign a multiplicity 
to a characteristic vector associated with a repeated characteristic 
root is completely artificial and without significance. The decisive 
consideration is the number of linearly independent characteristic 
vectors of a given matrix; hence, it is of fundamental importance 
to know when more than one independent characteristic vector 
is associated with a repeated characteristic root. The next theorem
gives us a partial answer to this question: 

THEOREM 6 

If $\lambda_1$ is a characteristic root of multiplicity $r$ of an $(n,n)$ matrix $A$, then the rank
of $A - \lambda_1I$ is equal to or greater than $n - r$.

PROOF If $A$ is an $(n,n)$ matrix and if $\lambda = \lambda_1$ is a repeated root of multiplicity
$r$ of the characteristic equation $|A - \lambda I| = 0$, then, writing $\lambda = \lambda_1 + w$, it is
clear that $w = 0$ is a repeated root of multiplicity $r$ of the equation

$$|A - (\lambda_1 + w)I| = |(A - \lambda_1I) - wI| = 0$$

Hence, the expanded form of the last equation, say

$$(-1)^n\left[w^n - \sigma_1w^{n-1} + \sigma_2w^{n-2} - \cdots + (-1)^{n-1}\sigma_{n-1}w + (-1)^n\sigma_n\right] = 0$$

must contain $w^r$ as a factor and must, therefore, reduce to

$$(-1)^n\left[w^n - \sigma_1w^{n-1} + \cdots + (-1)^{n-r}\sigma_{n-r}w^r\right] = 0$$

where $\sigma_{n-r} \ne 0$. Now, by Theorem 2, the coefficient $\sigma_{n-r}$ is equal to the sum of
all principal minors of order $n - r$ of the matrix $A - \lambda_1I$. Hence, since $\sigma_{n-r} \ne 0$,
at least one of these minors must be different from zero. In other words, the
rank of $A - \lambda_1I$ must be at least as great as $n - r$, as asserted.

If, for a particular root $\lambda_1$ of multiplicity $r$ of an $(n,n)$
matrix $A$, the equality sign holds in the assertion of Theorem
6, then, according to Theorem 6, Sec. 10.5, there are exactly
$n - (n - r) = r$ linearly independent characteristic vectors associated
with $\lambda_1$. Such a characteristic root is said to be regular.
However, this is the exception rather than the rule, and, in
general, there will be a single independent characteristic vector




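associated with a repeated characteristic root of any multiplicity.
For instance, for the matrix

$$A = \begin{Vmatrix} -3 & -7 & -5 \\ 2 & 4 & 3 \\ 1 & 2 & 2 \end{Vmatrix}$$

we have

$$|A - \lambda I| = \begin{vmatrix} -3 - \lambda & -7 & -5 \\ 2 & 4 - \lambda & 3 \\ 1 & 2 & 2 - \lambda \end{vmatrix} = -\lambda^3 + 3\lambda^2 - 3\lambda + 1 = -(\lambda - 1)^3 = 0$$

Thus, $A$ has a single characteristic root $\lambda = 1$. Moreover, for
$\lambda = 1$, the rank of

$$(A - \lambda I)\big|_{\lambda = 1} = A - I = \begin{Vmatrix} -4 & -7 & -5 \\ 2 & 3 & 3 \\ 1 & 2 & 1 \end{Vmatrix}$$

is clearly 2. Hence, according to Theorem 8, Sec. 10.5, the system
of equations $(A - I)X = 0$ has a single independent solution
vector, namely,

$$X = \begin{Vmatrix} -3 \\ 1 \\ 1 \end{Vmatrix}$$

and $A$ has just one independent characteristic vector.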
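The claims in this example can be verified mechanically. The sketch below assumes that $A$ is the matrix whose determinant $|A - \lambda I|$ is displayed above, with rows $(-3, -7, -5)$, $(2, 4, 3)$, $(1, 2, 2)$; the helper name `rank` is hypothetical. It computes the rank of $A - I$ by exact Gaussian elimination and confirms that the stated vector satisfies $(A - I)X = 0$.

```python
from fractions import Fraction

def rank(m):
    """Rank of a matrix by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0  # number of pivot rows found so far
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

A = [[-3, -7, -5], [2, 4, 3], [1, 2, 2]]
A_minus_I = [[A[i][j] - (1 if i == j else 0) for j in range(3)] for i in range(3)]

X = [-3, 1, 1]  # the solution vector quoted in the text
residual = [sum(A_minus_I[i][j] * X[j] for j in range(3)) for i in range(3)]

# Rank 2 means n - rank = 3 - 2 = 1 independent characteristic vector.
print(rank(A_minus_I), residual)
```

Exact fractions are used so that the rank decision does not depend on a floating-point tolerance.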

Later in this section we shall see that for hermitian matrices
the assertion of the last theorem can be sharpened to a strict
equality; in other words, we shall prove that, if $\lambda = \lambda_1$ is a characteristic
root of multiplicity $r$ of a hermitian matrix $A$, then
the rank of $A - \lambda_1I$ is exactly $n - r$. Preparatory to this, however,
it will be convenient to prove first some other theorems
about hermitian matrices:


THEOREM 7 

The characteristic values of a hermitian matrix are all real. 


PROOF Let $A$ be a hermitian matrix; let $\lambda_1$ be any one of its characteristic
values; and let $X_1$ be a characteristic vector corresponding to $\lambda_1$. Then

$$(A - \lambda_1I)X_1 = 0$$

or

$$AX_1 = \lambda_1X_1 \tag{9}$$

and from this, by premultiplying by $\bar{X}_1^T$, we obtain

$$\bar{X}_1^TAX_1 = \lambda_1\bar{X}_1^TX_1 \tag{10}$$

Now, from the properties of conjugate complex numbers, $\bar{X}_1^TX_1$ is real and in fact
positive. Furthermore, from Theorem 4, Sec. 11.1, we know that $\bar{X}_1^TAX_1$ is also
real. Hence, it follows immediately from Eq. (10) that $\lambda_1$ is real, as asserted.




Since, as we observed in Sec. 10.2, a real symmetric matrix 
is just a special case of a hermitian matrix, we have the following 
important corollary of Theorem 7 : 

COROLLARY 1 

The characteristic values of a real symmetric matrix are all real. 

Furthermore, since $iA$ is hermitian if $A$ is skew-hermitian
and since $|A - \lambda I| = 0$ implies $|iA - i\lambda I| = 0$, it follows that,
if $\lambda_1$ is a characteristic value of the skew-hermitian matrix $A$,
then $i\lambda_1$ is a characteristic value of the hermitian matrix $iA$.
Hence, by Theorem 7, $i\lambda_1$ is real, and, therefore, $\lambda_1$ is a pure
imaginary. Thus we have established the following result:

COROLLARY 2 

The characteristic values of a skew-hermitian matrix are all pure imaginary. 
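A two-by-two illustration of this corollary, using a real skew-symmetric (hence skew-hermitian) matrix chosen here for convenience and not taken from the text:

```latex
\[
A = \begin{Vmatrix} 0 & 1 \\ -1 & 0 \end{Vmatrix}, \qquad
|A - \lambda I| = \lambda^2 + 1 = 0, \qquad \lambda = \pm i
\]
\[
iA = \begin{Vmatrix} 0 & i \\ -i & 0 \end{Vmatrix}, \qquad
|iA - \mu I| = \mu^2 - (i)(-i) = \mu^2 - 1 = 0, \qquad \mu = \pm 1
\]
```

The matrix $iA$ is hermitian, its roots $\mu = \pm 1$ are real, and they are exactly $i\lambda$ for $\lambda = \mp i$, in agreement with the argument preceding the corollary.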

Knowing now that the characteristic roots of a hermitian 
matrix A are all real, we can return to the characteristic equation 
of A and prove the following result: 

THEOREM 8 

If $\lambda^n - \beta_1\lambda^{n-1} + \beta_2\lambda^{n-2} - \cdots + (-1)^{n-1}\beta_{n-1}\lambda + (-1)^n\beta_n = 0$ is the characteristic
equation of a hermitian matrix $A$, then the characteristic roots of $A$
are all positive if and only if each $\beta$ is positive.

PROOF If $A$ is a hermitian matrix, it follows from Theorem 7 that the roots
of the characteristic equation

$$\lambda^n - \beta_1\lambda^{n-1} + \beta_2\lambda^{n-2} - \cdots + (-1)^{n-1}\beta_{n-1}\lambda + (-1)^n\beta_n = 0$$

are all real. Furthermore, if each $\beta$ is positive, it follows by Descartes's rule of
signs that no root of the characteristic equation can be negative or zero. Hence,
all the characteristic roots must be positive. Conversely, if the characteristic
roots of $A$ are all positive, then from the root-coefficient relations

$$\begin{aligned}
\beta_1 &= \lambda_1 + \lambda_2 + \cdots + \lambda_n \\
\beta_2 &= \lambda_1\lambda_2 + \lambda_1\lambda_3 + \cdots + \lambda_{n-1}\lambda_n \\
&\;\;\vdots \\
\beta_n &= \lambda_1\lambda_2 \cdots \lambda_n
\end{aligned}$$

it follows at once that each $\beta$ is positive, as asserted.

COROLLARY 1

If $\lambda^n - \beta_1\lambda^{n-1} + \beta_2\lambda^{n-2} - \cdots + (-1)^{n-1}\beta_{n-1}\lambda + (-1)^n\beta_n = 0$ is the characteristic
equation of a real symmetric matrix $A$, then the characteristic roots
of $A$ are all positive if and only if each $\beta$ is positive.
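Two real symmetric matrices, chosen here purely for illustration, show both directions of this test:

```latex
\[
A = \begin{Vmatrix} 2 & 1 \\ 1 & 2 \end{Vmatrix}: \quad
\beta_1 = 4,\ \beta_2 = 3, \qquad
\lambda^2 - 4\lambda + 3 = (\lambda - 1)(\lambda - 3) = 0, \qquad \lambda = 1,\ 3
\]
\[
A' = \begin{Vmatrix} 1 & 2 \\ 2 & 1 \end{Vmatrix}: \quad
\beta_1 = 2,\ \beta_2 = -3, \qquad
\lambda^2 - 2\lambda - 3 = (\lambda + 1)(\lambda - 3) = 0, \qquad \lambda = -1,\ 3
\]
```

For $A$ every $\beta$ is positive and so is every root; for $A'$ the negative coefficient $\beta_2 = |A'|$ signals the negative root without the equation having to be solved.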

One of the most important properties of the characteristic 
vectors of a hermitian matrix is that of orthogonality. More 
precisely, we have the following theorem: 







THEOREM 9 

If $X_i$ and $X_j$ are characteristic vectors corresponding, respectively, to the distinct
characteristic values $\lambda_i$ and $\lambda_j$ of a hermitian matrix $A$, then $\bar{X}_i^TX_j = 0$.

PROOF By hypothesis, we have

$$AX_i = \lambda_iX_i \tag{11}$$

$$AX_j = \lambda_jX_j \tag{12}$$

If in the first of these we take the conjugate and then the transpose of each
member, we obtain

$$\bar{X}_i^T\bar{A}^T = \bar{\lambda}_i\bar{X}_i^T$$

or, since $\bar{A}^T = A$, by hypothesis, and $\bar{\lambda}_i = \lambda_i$, by Theorem 7,

$$\bar{X}_i^TA = \lambda_i\bar{X}_i^T \tag{13}$$

Now, if we premultiply Eq. (12) by $\bar{X}_i^T$ and postmultiply Eq. (13) by $X_j$, we
obtain, respectively,

$$\bar{X}_i^TAX_j = \lambda_j\bar{X}_i^TX_j$$
$$\bar{X}_i^TAX_j = \lambda_i\bar{X}_i^TX_j$$

Finally, subtracting these equations, we have

$$(\lambda_i - \lambda_j)\bar{X}_i^TX_j = 0$$

or, since $\lambda_i \ne \lambda_j$ by hypothesis,

$$\bar{X}_i^TX_j = 0 \quad\text{as asserted.}$$

COROLLARY 1

If $X_i$ and $X_j$ are characteristic vectors corresponding to the distinct characteristic
values $\lambda_i$ and $\lambda_j$ of a real symmetric matrix, then $X_i^TX_j = 0$.
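The role of the conjugate in this orthogonality relation is easy to overlook, so a small hermitian example may help. The matrix below is an illustrative choice, not one from the text.

```latex
\[
A = \begin{Vmatrix} 2 & i \\ -i & 2 \end{Vmatrix}, \qquad
|A - \lambda I| = (2 - \lambda)^2 - (i)(-i) = (2 - \lambda)^2 - 1 = 0, \qquad
\lambda_1 = 3,\ \lambda_2 = 1
\]
\[
X_1 = \begin{Vmatrix} i \\ 1 \end{Vmatrix}, \quad
AX_1 = \begin{Vmatrix} 3i \\ 3 \end{Vmatrix} = 3X_1; \qquad
X_2 = \begin{Vmatrix} -i \\ 1 \end{Vmatrix}, \quad
AX_2 = \begin{Vmatrix} -i \\ 1 \end{Vmatrix} = X_2
\]
\[
\bar{X}_1^TX_2 = (-i)(-i) + (1)(1) = -1 + 1 = 0, \qquad
X_1^TX_2 = (i)(-i) + (1)(1) = 2 \ne 0
\]
```

Both roots are real, as Theorem 7 requires, and the vectors are orthogonal only when the conjugate transpose is used; the plain transpose suffices only in the real symmetric case of the corollary.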

We are now in a position to return to the question we raised
earlier in this section about the rank of $A - \lambda_1I$ when $A$ is
hermitian and $\lambda_1$ is a characteristic root of multiplicity $r$. As
the next theorem shows, every characteristic root of a hermitian
matrix is regular; that is, if $A$ is hermitian, then, for every characteristic
root $\lambda_i$ of multiplicity $r$, the rank of $A - \lambda_iI$ drops
to the minimum permitted by Theorem 6, namely, $n - r$, and
there are $r$ linearly independent characteristic vectors corresponding
to $\lambda_i$:

THEOREM 10 

If A is a hermitian matrix, then to every r-fold characteristic root of A there 
correspond exactly r linearly independent characteristic vectors. 

PROOF Let $\lambda_1$ be a repeated characteristic root of a hermitian matrix $A$;
let $U_1$ be any normalized characteristic vector corresponding to $\lambda_1$; and let $U$
be any unitary matrix having $U_1$ as its first column. In virtue of the Schmidt
orthogonalization process, it is clear that such a matrix exists. Moreover, since $U$
is unitary, that is, since $\bar{U}^T = U^{-1}$, it follows from Theorem 3 that the matrices
$\bar{U}^TAU - \lambda I$ and $A - \lambda I$




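have the same characteristic equation and, therefore, the same characteristic
roots. Now let us write $U$ in partitioned form:

$$U = \Vert U_1 \;\; U_2 \;\; \cdots \;\; U_n \Vert$$

Then, remembering that $AU_1 = \lambda_1U_1$ since, by hypothesis, $U_1$ is a characteristic
vector of $A$ corresponding to $\lambda_1$, we have

$$\begin{aligned}
\bar{U}^TAU &= \begin{Vmatrix} \bar{U}_1^T \\ \bar{U}_2^T \\ \vdots \\ \bar{U}_n^T \end{Vmatrix} \cdot A\,\Vert U_1 \;\; U_2 \;\; \cdots \;\; U_n \Vert
= \begin{Vmatrix} \bar{U}_1^T \\ \bar{U}_2^T \\ \vdots \\ \bar{U}_n^T \end{Vmatrix} \cdot \Vert AU_1 \;\; AU_2 \;\; \cdots \;\; AU_n \Vert \\
&= \begin{Vmatrix} \bar{U}_1^T \\ \bar{U}_2^T \\ \vdots \\ \bar{U}_n^T \end{Vmatrix} \cdot \Vert \lambda_1U_1 \;\; AU_2 \;\; \cdots \;\; AU_n \Vert
= \begin{Vmatrix} \lambda_1 & \bar{U}_1^TAU_2 & \cdots & \bar{U}_1^TAU_n \\ 0 & \bar{U}_2^TAU_2 & \cdots & \bar{U}_2^TAU_n \\ \vdots & \vdots & & \vdots \\ 0 & \bar{U}_n^TAU_2 & \cdots & \bar{U}_n^TAU_n \end{Vmatrix}
\end{aligned}$$

where the zeros in the first column of the last matrix enter because $U$ is a unitary
matrix, and, therefore, any two of its columns satisfy the relation

$$\bar{U}_i^TU_j = \begin{cases} 1 & i = j \\ 0 & i \ne j \end{cases}$$

The remaining entries are not in general zero, since $U_i$ and $U_j$ are not orthogonal
with respect to the matrix $A$. However, since $A$ and, therefore, $\bar{U}^TAU$ are hermitian
(see Exercise 8), it follows that the elements after the first in the first row
must all be zero. Thus

$$\bar{U}^TAU = \begin{Vmatrix} \lambda_1 & 0 & 0 & \cdots & 0 \\ 0 & \alpha_{22} & \alpha_{23} & \cdots & \alpha_{2n} \\ 0 & \alpha_{32} & \alpha_{33} & \cdots & \alpha_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & \alpha_{n2} & \alpha_{n3} & \cdots & \alpha_{nn} \end{Vmatrix}$$

and

$$|\bar{U}^TAU - \lambda I| = \begin{vmatrix} \lambda_1 - \lambda & 0 & 0 & \cdots & 0 \\ 0 & \alpha_{22} - \lambda & \alpha_{23} & \cdots & \alpha_{2n} \\ 0 & \alpha_{32} & \alpha_{33} - \lambda & \cdots & \alpha_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & \alpha_{n2} & \alpha_{n3} & \cdots & \alpha_{nn} - \lambda \end{vmatrix}$$

Therefore, if $\lambda_1$ is a repeated root of

$$|\bar{U}^TAU - \lambda I| = |A - \lambda I| = 0$$

then $\lambda_1 - \lambda$ must be a factor of the minor of the element in the first row and
first column of $|\bar{U}^TAU - \lambda I|$. But if this minor vanishes when $\lambda = \lambda_1$, then the rank
of $\bar{U}^TAU - \lambda_1I$, and hence also the rank of $A - \lambda_1I = U(\bar{U}^TAU - \lambda_1I)\bar{U}^T$, is at most
$n - 2$, since all other minors of order $n - 1$ obviously
contain either a column of zeros or a row of zeros. Hence, by Theorem 6, Sec. 10.5,
$(A - \lambda_1I)X = 0$ has at least two linearly independent solution vectors, and $A$ has
at least two linearly independent characteristic vectors corresponding to $\lambda_1$.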

If the multiplicity of $\lambda_1$ is more than 2, the preceding argument can be
repeated, using this time any unitary matrix $U$ whose first two columns are any
two orthonormal characteristic vectors corresponding to $\lambda_1$. This leads to the
conclusion that $\lambda_1 - \lambda$ must be a factor of the complementary minor of the




second-order minor in the first two rows and first two columns of

$$|\bar{U}^TAU - \lambda I| = |A - \lambda I|$$

Hence, since all other $(n - 2)$-order minors obviously vanish, it is evident that,
when $\lambda = \lambda_1$, the rank of $A - \lambda I$ is not more than $n - 3$, and $A$, therefore, has
at least three linearly independent characteristic vectors. Clearly, this procedure
can be continued until we reach the conclusion that, if $\lambda_1$ is an $r$-fold characteristic
root of $A$, then the rank of $A - \lambda_1I$ is at most $n - r$, and hence $A$ has at least $r$ independent
characteristic vectors corresponding to $\lambda_1$. But, by Theorem 6, $A$ can have at most $r$ independent
characteristic vectors corresponding to an $r$-fold characteristic root. Hence, $A$
must have exactly $r$ linearly independent characteristic vectors corresponding to $\lambda_1$, as asserted.

Since, as we have repeatedly observed, a real symmetric 
matrix is a special case of a hermitian matrix, it is clear that 
we also have the following result: 

COROLLARY 1 

If A is a real symmetric matrix, then to every r-fold characteristic root of A there 
correspond exactly r linearly independent characteristic vectors. 

We are now in a position to prove the following fundamental 
theorem: 

THEOREM 11

Every $(n,n)$ hermitian matrix has $n$ linearly independent characteristic vectors.

PROOF Let $A$ be an $(n,n)$ hermitian matrix. It may, of course, possess one
or more repeated characteristic roots, but, if it does, we know from the last
theorem that to each root of multiplicity $r$ there correspond exactly $r$ linearly
independent characteristic vectors. Hence, $A$ cannot have more than $n$ linearly
independent characteristic vectors. Specifically, let the characteristic roots of $A$ be

$$\lambda_1, \lambda_2, \ldots, \lambda_k \qquad 1 \le k \le n$$

let the multiplicity of $\lambda_i$ be $r_i$, where $\sum_{i=1}^{k} r_i = n$; and let

$$X_{i1}, X_{i2}, \ldots, X_{ir_i}$$

be $r_i$ independent characteristic vectors corresponding to $\lambda_i$. Suppose, now, contrary
to the assertion of the theorem, that these $n$ characteristic vectors of $A$ are
not linearly independent. Then there exists a relation of the form

$$(c_{11}X_{11} + \cdots + c_{1r_1}X_{1r_1}) + (c_{21}X_{21} + \cdots + c_{2r_2}X_{2r_2}) + \cdots + (c_{k1}X_{k1} + \cdots + c_{kr_k}X_{kr_k}) = 0 \tag{14}$$

in which at least one $c$ is different from zero.

Now consider a typical group of terms, say the $i$th, in the last expression.
By Theorem 3, Sec. 10.5, unless the $c$'s in such a group are all zero, the combination
defines a characteristic vector corresponding to the characteristic value
$\lambda = \lambda_i$. Thus, Eq. (14) is simply an expression of the form

$$c_1X_1 + c_2X_2 + \cdots + c_kX_k = 0$$






in which each $c$ is either 0 or 1 and at least one $c$ is different from zero. But since
the $X$'s now correspond to distinct characteristic values, it follows from Theorem 5
that they are linearly independent and, hence, that each $c$ must be zero.
This contradiction establishes the theorem.

COROLLARY 1

Every real symmetric (n,n) matrix has n linearly independent characteristic 
vectors. 

If an (n,n) matrix has n linearly independent characteristic 
vectors, then, by means of the Schmidt orthogonalization process 
applied to the vectors in each of the sets corresponding to a 
repeated root, a set of normalized mutually orthogonal charac- 
teristic vectors can always be constructed. Hence, we have the 
following important result: 

COROLLARY 2 

Every (n,n) hermitian or real symmetric matrix has a set of n orthonormal 
characteristic vectors. 

An (n,n) matrix whose columns are orthonormal characteristic 
vectors of an (n,n) matrix A is said to be a modal matrix of A. 
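To make the construction of a modal matrix concrete, here is a minimal sketch for a real symmetric matrix with a repeated root. The matrix, its hand-computed characteristic vectors, and the helper names `dot` and `gram_schmidt` are assumptions made for this illustration. The Schmidt process is applied separately to the vectors belonging to the repeated root, as the text describes, and the resulting orthonormal vectors become the columns of the modal matrix.

```python
import math

def dot(u, v):
    """Ordinary scalar product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Schmidt orthogonalization of real, linearly independent vectors.

    Returns an orthonormal list spanning the same space.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = dot(v, b)  # component of v along an earlier unit vector
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis

# Illustrative symmetric matrix with characteristic roots 1, 1, 4.
A = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]
roots = [1, 1, 4]

# Two independent (but not orthogonal) vectors for the repeated root 1,
# and one vector for the simple root 4, all found by hand.
repeated = gram_schmidt([[1, -1, 0], [1, 0, -1]])
simple = gram_schmidt([[1, 1, 1]])
columns = repeated + simple

# Modal matrix: the orthonormal characteristic vectors written as columns.
M = [[columns[j][i] for j in range(3)] for i in range(3)]
for row in M:
    print([round(x, 4) for x in row])
```

Vectors belonging to different roots are already orthogonal by Theorem 9, which is why the process only needs to be run inside each repeated-root set.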

In many applications in physics, chemistry, and engineering
it is necessary to consider matric equations of the form
$(A - \lambda B)X = 0$ in which $A$ and $B$ are either hermitian or
real and symmetric. Such an equation will, of course, have
nontrivial solutions if and only if the determinant of the coefficients
is equal to zero. Paralleling our earlier terminology,
the equation $|A - \lambda B| = 0$ is called the characteristic equation
of the system, its roots are called the characteristic roots or
characteristic values of the system, and the corresponding nontrivial
solutions are called the characteristic vectors of the system.
As one should expect, the theory of the equation $(A - \lambda B)X = 0$
resembles closely the theory of the equation $(A - \lambda I)X = 0$. In
particular, we have the following results:

THEOREM 12

The equation $(A - \lambda B)X = 0$ has zero as a characteristic root if and only if $A$
is singular.

PROOF This follows immediately from a consideration of the characteristic
equation $|A - \lambda B| = 0$ when the left-hand side is expressed as a polynomial in $\lambda$.

THEOREM 13

If $A$ and $B$ are hermitian (or real symmetric) matrices and if $B$ is definite, then
the characteristic values of $(A - \lambda B)X = 0$ are all real.

PROOF Let $A$ and $B$ be hermitian (or real symmetric) matrices, let $B$ be
definite, and let $X_1$ be a characteristic vector of the equation

$$(A - \lambda B)X = 0$$





corresponding to the characteristic value $\lambda_1$. Then

$$AX_1 = \lambda_1BX_1 \tag{15}$$

Hence, premultiplying Eq. (15) by $\bar{X}_1^T$, we have

$$\bar{X}_1^TAX_1 = \lambda_1\bar{X}_1^TBX_1 \tag{16}$$

Now, from Theorem 4, Sec. 11.1, we know that both $\bar{X}_1^TAX_1$ and $\bar{X}_1^TBX_1$ are
real numbers. Moreover, since $B$ is definite, $\bar{X}_1^TBX_1 \ne 0$. Hence it follows from
Eq. (16) that $\lambda_1$ is real, as asserted.

By inspection of Eq. (16), the following results are obtained 
immediately: 

COROLLARY 1 

If $A$ and $B$ are hermitian (or real symmetric) matrices which are both positive-definite
or both negative-definite, then the characteristic values of

$$(A - \lambda B)X = 0$$

are all positive.

COROLLARY 2

If $A$ and $B$ are hermitian (or real symmetric) matrices and if $A$ is positive-definite
and $B$ is negative-definite or vice versa, then the characteristic values of

$$(A - \lambda B)X = 0$$

are all negative.
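Both corollaries can be seen at work in a single small system. The matrices below are illustrative choices, not taken from the text; $A$ is positive-definite because its leading principal minors $3$ and $8$ are positive, and $B$ is positive-definite because it is diagonal with positive entries.

```latex
\[
A = \begin{Vmatrix} 3 & -2 \\ -2 & 4 \end{Vmatrix}, \qquad
B = \begin{Vmatrix} 1 & 0 \\ 0 & 2 \end{Vmatrix}
\]
\[
|A - \lambda B| = (3 - \lambda)(4 - 2\lambda) - 4 = 2\lambda^2 - 10\lambda + 8 = 2(\lambda - 1)(\lambda - 4) = 0,
\qquad \lambda = 1,\ 4
\]
\[
|A - \lambda(-B)| = (3 + \lambda)(4 + 2\lambda) - 4 = 2(\lambda + 1)(\lambda + 4) = 0,
\qquad \lambda = -1,\ -4
\]
```

With $A$ and $B$ both positive-definite the characteristic values are positive; replacing $B$ by the negative-definite matrix $-B$ simply reverses their signs, as Eq. (16) predicts.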

THEOREM 14 

If $A$ and $B$ are hermitian (or real symmetric) matrices and if $X_1, X_2, \ldots, X_k$
are characteristic vectors of the equation $(A - \lambda B)X = 0$ corresponding, respectively,
to the distinct characteristic values $\lambda_1, \lambda_2, \ldots, \lambda_k$, then the $X$'s satisfy
the generalized orthogonality condition

$$\bar{X}_i^TBX_j = 0 \quad (\text{or } X_i^TBX_j = 0) \qquad i \ne j$$

PROOF Let $A$ and $B$ be hermitian matrices, let $\lambda_i$ and $\lambda_j$ be distinct characteristic
values of the equation $(A - \lambda B)X = 0$, and let $X_i$ and $X_j$ be characteristic
vectors corresponding respectively to $\lambda_i$ and $\lambda_j$. Then

$$AX_i = \lambda_iBX_i \quad\text{and}\quad AX_j = \lambda_jBX_j$$

If we premultiply the first of these equations by $\bar{X}_j^T$ and the second by $\bar{X}_i^T$, we
obtain, respectively,

$$\bar{X}_j^TAX_i = \lambda_i\bar{X}_j^TBX_i \tag{17}$$

and

$$\bar{X}_i^TAX_j = \lambda_j\bar{X}_i^TBX_j \tag{18}$$

Now, if we take the transpose and then the conjugate of each side of Eq. (17),
remembering that $A$ and $B$ are hermitian, we obtain

$$\bar{X}_i^TAX_j = \lambda_i\bar{X}_i^TBX_j \tag{19}$$




Finally, subtracting Eq. (18) from Eq. (19), we have

$$(\lambda_i - \lambda_j)\bar{X}_i^TBX_j = 0$$

Therefore, since $\lambda_i \ne \lambda_j$, by hypothesis, it follows that

$$\bar{X}_i^TBX_j = 0 \qquad i \ne j \quad\text{as asserted.}$$

If $A$ and $B$ are real symmetric matrices, an almost identical proof carried through
with $X_j^T$ and $X_i^T$ replacing $\bar{X}_j^T$ and $\bar{X}_i^T$ serves to establish the parenthetical assertion
of the theorem.

COROLLARY 1 

If $A$ and $B$ are hermitian (or real symmetric) matrices and if $X_i$ and $X_j$ are
characteristic vectors of $(A - \lambda B)X = 0$ corresponding, respectively, to the distinct
characteristic values $\lambda_i$ and $\lambda_j$, then $\bar{X}_i^TAX_j = 0$ (or $X_i^TAX_j = 0$).

PROOF This result follows immediately from Eq. (18) (or the equation
$X_i^TAX_j = \lambda_jX_i^TBX_j$ if $A$ and $B$ are real symmetric matrices) and the fact, guaranteed
by Theorem 14, that $\bar{X}_i^TBX_j = 0$ (or $X_i^TBX_j = 0$).

THEOREM 15 

If $A$ and $B$ are hermitian (or real symmetric) matrices, if $B$ is either positive-definite
or negative-definite, and if $X_1, X_2, \ldots, X_k$ are characteristic vectors
of $(A - \lambda B)X = 0$ corresponding, respectively, to the distinct characteristic values
$\lambda_1, \lambda_2, \ldots, \lambda_k$, then $X_1, X_2, \ldots, X_k$ are linearly independent.

PROOF Let $A$ and $B$ be hermitian matrices; let $B$ be definite; and let us
suppose, contrary to the theorem, that the characteristic vectors $X_1, X_2, \ldots, X_k$
corresponding respectively to the distinct characteristic values $\lambda_1, \lambda_2, \ldots, \lambda_k$
of $(A - \lambda B)X = 0$ are linearly dependent. Then there exists a relation of the
form

$$c_1X_1 + c_2X_2 + \cdots + c_kX_k = 0$$

in which at least one of the $c$'s, say $c_i$, is different from zero. Now, if we multiply
the last equation through on the left by $\bar{X}_i^TB$,† we get

$$c_1\bar{X}_i^TBX_1 + c_2\bar{X}_i^TBX_2 + \cdots + c_i\bar{X}_i^TBX_i + \cdots + c_k\bar{X}_i^TBX_k = 0$$

However, from the orthogonality guaranteed by Theorem 14, it follows that
every term in the last equation except $c_i\bar{X}_i^TBX_i$ is equal to zero. Moreover, by
hypothesis, $B$ is either positive-definite or negative-definite. Hence, $\bar{X}_i^TBX_i \ne 0$,
and, therefore, $c_i = 0$, contrary to the assumption of linear dependence. This
contradiction shows that the $X$'s must be linearly independent, and the theorem
is established.

The last theorem must not be misinterpreted as asserting 
that if A and B are hermitian (or real symmetric) {n,n) matrices 


† This procedure suffices when $A$ and $B$ are real, symmetric matrices, as
well as when they are hermitian because, although Theorem 14 does not
assert it explicitly, it is clear that $\bar{X}_i^TBX_j = 0$, $i \ne j$, must also hold for
real symmetric matrices, since these are just special cases of hermitian
matrices.




and if $B$ is definite, then $(A - \lambda B)X = 0$ has $n$ linearly independent
characteristic vectors. It guarantees that characteristic
vectors corresponding to distinct characteristic values of

$$(A - \lambda B)X = 0$$

are linearly independent, but it says nothing about how many
distinct characteristic values there are or about how many
independent characteristic vectors correspond to a repeated
characteristic value. If, because of repeated roots,

$$(A - \lambda B)X = 0$$

has fewer than $n$ distinct characteristic values, then, for all we
know at present, $(A - \lambda B)X = 0$ has fewer than $n$ linearly
independent characteristic vectors. However, by a proof very
much like the proof of Theorem 10, the following result can be
established:

THEOREM 16 

If $A$ and $B$ are hermitian (or real symmetric) matrices and if $B$ is either positive-definite
or negative-definite, then to a repeated characteristic value of

$$(A - \lambda B)X = 0$$

of multiplicity $r$ there correspond exactly $r$ linearly independent characteristic
vectors.

With this theorem available, it is not difficult to establish
the following counterpart of Theorem 11:

THEOREM 17

If $A$ and $B$ are hermitian (or real symmetric) $(n,n)$ matrices and if $B$ is either
positive-definite or negative-definite, then the equation $(A - \lambda B)X = 0$ has
exactly $n$ linearly independent characteristic vectors.

By a straightforward application of the Schmidt orthogonalization
process applied to the $n$ linearly independent characteristic
vectors of $(A - \lambda B)X = 0$ guaranteed by Theorem
17, we can establish the following useful results:

COROLLARY 1 

If $A$ and $B$ are hermitian (or real symmetric) $(n,n)$ matrices and if $B$ is definite,
then $(A - \lambda B)X = 0$ possesses $n$ characteristic vectors orthogonal with respect
to $B$.

COROLLARY 2

If $A$ and $B$ are hermitian (or real symmetric) $(n,n)$ matrices and if $B$ is positive-definite,
then $(A - \lambda B)X = 0$ possesses $n$ characteristic vectors orthonormal
with respect to $B$.

With Theorem 17 and its corollaries available, it is now an
easy matter to express an arbitrary vector $C$ with $n$ components
as a linear combination of the characteristic vectors of the equation
$(A - \lambda B)X = 0$, provided $A$ and $B$ are hermitian and $B$
is definite. For we can write

$$C = c_1X_1 + c_2X_2 + \cdots + c_nX_n \tag{20}$$

where the $X$'s are characteristic vectors of $(A - \lambda B)X = 0$
mutually orthogonal with respect to $B$. Then, if we premultiply
Eq. (20) by $\bar{X}_i^TB$, we obtain

$$\bar{X}_i^TBC = c_1\bar{X}_i^TBX_1 + \cdots + c_i\bar{X}_i^TBX_i + \cdots + c_n\bar{X}_i^TBX_n$$

From the orthogonality of the $X$'s, it follows that every term
on the right except $c_i\bar{X}_i^TBX_i$ is equal to zero. Moreover, since $B$
is definite, it follows that $\bar{X}_i^TBX_i \ne 0$. Hence, we can solve
for $c_i$, getting

$$c_i = \frac{\bar{X}_i^TBC}{\bar{X}_i^TBX_i} \qquad i = 1, 2, \ldots, n$$

If the $X$'s have been normalized with respect to $B$, that is, if
$\bar{X}_i^TBX_i = 1$ $(i = 1, 2, \ldots, n)$, the last formula reduces to
the simpler expression

$$c_i = \bar{X}_i^TBC \qquad i = 1, 2, \ldots, n$$

The fact that we were able to solve for the coefficients in the
expression (20) without solving any simultaneous equations
should make clear the great convenience of working with a set
of vectors which are orthogonal.
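The expansion procedure just described can be tried on a small real symmetric system. In the sketch below the matrices $A$ and $B$, the vector $C$, and the helper name `quad` are assumptions made for illustration; the characteristic vectors were found by hand and belong to the characteristic values $1$ and $4$. Each coefficient is obtained from $c_i = X_i^TBC / X_i^TBX_i$ with no simultaneous equations to solve.

```python
from fractions import Fraction

def quad(x, b, y):
    """The bilinear form x^T B y for real vectors x, y and a real matrix B."""
    n = len(x)
    return sum(x[i] * b[i][j] * y[j] for i in range(n) for j in range(n))

# Illustrative real symmetric system (A - lambda B) X = 0 with B positive-definite.
A = [[3, -2], [-2, 4]]
B = [[1, 0], [0, 2]]
X = [[1, 1], [2, -1]]   # characteristic vectors for lambda = 1 and lambda = 4
lambdas = [1, 4]
C = [5, 1]              # arbitrary vector to be expanded

# The characteristic vectors are orthogonal with respect to B (Theorem 14).
assert quad(X[0], B, X[1]) == 0

# Expansion coefficients c_i = X_i^T B C / X_i^T B X_i, kept exact.
coeffs = [Fraction(quad(x, B, C), quad(x, B, x)) for x in X]

# Rebuild C from the expansion as a check.
rebuilt = [sum(c * x[k] for c, x in zip(coeffs, X)) for k in range(2)]
print(coeffs, rebuilt)
```

Had the vectors first been normalized with respect to $B$, the denominators would all equal 1 and each coefficient would reduce to $X_i^TBC$, the simpler expression given in the text.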


EXERCISES 

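1 Find the characteristic values and the corresponding characteristic vectors of each of the
following matrices:

a $\begin{Vmatrix} 4 & 6 & 6 \\ 1 & 3 & 2 \\ -1 & -5 & -2 \end{Vmatrix}$

b $\begin{Vmatrix} 7 & -2 & -4 \\ 3 & 0 & -2 \\ 6 & -2 & -3 \end{Vmatrix}$

c $\begin{Vmatrix} 11 & -4 & -7 \\ 7 & -2 & -5 \\ 10 & -4 & -6 \end{Vmatrix}$

d $\begin{Vmatrix} 4 & 6 & 6 \\ 1 & 3 & 2 \\ -1 & -4 & -3 \end{Vmatrix}$

e $\begin{Vmatrix} 1 & 1 & 1 \\ 1 & 3 & 3 \\ 2 & 1 & 4 \end{Vmatrix}$

f $\begin{Vmatrix} -4 & 5 & 5 \\ -5 & 6 & 5 \\ -5 & 5 & 6 \end{Vmatrix}$

For which of these, if any, are the characteristic vectors orthogonal?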

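2 Find the characteristic values and the corresponding characteristic vectors for the equation
$(A - \lambda B)X = 0$, if

a $A = \begin{Vmatrix} 8 & -2 & 0 \\ -2 & 3 & -1 \\ 0 & -1 & 2 \end{Vmatrix}$, $B = \begin{Vmatrix} 8 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{Vmatrix}$

b $A = \begin{Vmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{Vmatrix}$, $B = \begin{Vmatrix} 3 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 3 \end{Vmatrix}$

c $A = \begin{Vmatrix} 6 & -3 & 0 \\ -3 & 6 & -3 \\ 0 & -3 & 4 \end{Vmatrix}$, $B = \begin{Vmatrix} 6 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 4 \end{Vmatrix}$

d $A = \begin{Vmatrix} 3 & -1 & 0 \\ -1 & 1 & -1 \\ 0 & -1 & 5 \end{Vmatrix}$, $B = \begin{Vmatrix} 4 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 4 \end{Vmatrix}$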




In each case, verify all orthogonality relations, and, using orthogonality properties, express
the vector $V = \begin{Vmatrix} 1 \\ 2 \\ 3 \end{Vmatrix}$ as a linear combination of the characteristic vectors.

3 Find three solution vectors of the equation $(A - \lambda B)X = 0$ which are orthonormal with
respect to $B$ if

a $A = \begin{Vmatrix} 7 & 1 & -1 \\ 1 & 4 & -1 \\ -1 & -1 & 3 \end{Vmatrix}$, $B =$ [matrix illegible in the source]

b $A = \begin{Vmatrix} 7 & -1 & -1 \\ -1 & 4 & 1 \\ -1 & 1 & 3 \end{Vmatrix}$, $B = \begin{Vmatrix} 6 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 2 \end{Vmatrix}$

4 Prove Theorem 17.

5 Prove Corollary 1, Theorem 17.

6 In Corollary 2, Theorem 17, why is it necessary to restrict $B$ to being positive-definite?

7 Prove that $iA$ is hermitian if $A$ is skew-hermitian.

8 If $A$ is hermitian and $U$ is unitary, prove that $\bar{U}^TAU$ is hermitian.

9 Under what conditions, if any, is it possible for every value of $\lambda$ to be a characteristic value
of the equation $(A - \lambda B)X = 0$?

10 Show by an example that, if $A$ and $B$ are indefinite, the characteristic values of $(A - \lambda B)X = 0$
need not be real, even though $A$ and $B$ are real and symmetric.

11 Show that, if either $A$ or $B$ is nonsingular, then $AB$ and $BA$ have the same characteristic
values. Hence prove that there are no matrices $A$ and $B$, with either $A$ or $B$ nonsingular,
such that $AB - BA = I$. (These results hold even when both $A$ and $B$ are singular.)

12 Show that the characteristic values of a real skew-symmetric matrix are either zero or pure
imaginary.

13 Prove that, if a (2,2) matrix has characteristic vectors which are orthogonal, it is symmetric.

14 Prove that, if every characteristic value of a symmetric matrix is zero, the matrix is a null
matrix. Is this true if the matrix is not symmetric?

15 Show by an example that Corollary 1, Theorem 10, is false for symmetric matrices which
are not real.


11.3 

The transformation of matrices 

In previous sections we have already encountered the idea of 
the transformation of matrices. For instance, in Sec. 10.4 we 
defined equivalent matrices as matrices $A$ and $B$ connected by a 
relation of the form 

$$B = QAP \qquad P, Q \text{ nonsingular}$$

Again, in Sec. 11.1 we observed that, if the variables in a quadratic 
form $X^T A X$ are subjected to a nonsingular linear transformation 
$X = PY$, then the quadratic form becomes 

$$Y^T B Y \qquad \text{where } B = P^T A P$$

In particular, we observed that, if a nonsingular matrix $P$ can 
be found with the property that $B = P^T A P$ is a diagonal matrix, 
then, in terms of the new variables introduced by the substitution 
$X = PY$, the quadratic form $X^T A X$ becomes just a sum 
of squares. In this section we shall consider briefly the question 
of just when it is possible to transform an (n,n) matrix $A$ into a 
diagonal matrix $B$ by multiplying it on the right and on the 
left by suitable matrices $P$ and $Q$. 

Because an equivalence transformation is simply a com- 
posite of elementary transformations, it is clear that, for any 
square matrix A, many pairs of nonsingular matrices P and Q 
can be found such that QAP is diagonal. In other words, we 
have the following result, which is little more than a restatement 
of Theorem 10, Sec. 10.4, for square matrices: 

THEOREM 1 

Any square matrix is equivalent to a diagonal matrix. 

COROLLARY 1 

If a matrix A of rank r is equivalent to a diagonal matrix B, then B has exactly r 
nonzero diagonal elements. 

A square matrix cannot in general be transformed into a 
diagonal matrix by a transformation more restricted than an 
equivalence. However, in many important special cases this is 
possible, as the following theorems show: 

THEOREM 2 

A square matrix is congruent to a diagonal matrix if and only if it is symmetric. 

PROOF The proof that, if $A$ is symmetric, then it is congruent to a diagonal 
matrix was essentially given in our discussion of the Lagrange reduction in Sec. 
11.1. For, if $A$ is symmetric, then it can be regarded as the matrix of a quadratic 
form, and the Lagrange reduction provides a linear transformation whose matrix 
$P$ is a nonsingular matrix with the property that $P^T A P$ is diagonal. 

On the other hand, if $A$ is congruent to a diagonal matrix $D$, then there 
exists a nonsingular matrix $P$ such that $A = P^T D P$. Furthermore, if this is the 
case, then the transpose of $A$ is 

$$A^T = (P^T D P)^T = P^T D^T P$$

However, $D^T = D$, since any diagonal matrix is obviously symmetric. Hence, 

$$A^T = P^T D^T P = P^T D P = A$$

Therefore, $A$, being equal to its transpose, is symmetric, as asserted. 
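
The first half of the proof of Theorem 2 can be made concrete with a short computation. The sketch below is not the Lagrange reduction as presented in Sec. 11.1 but an equivalent elimination, in which each column operation is paired with the matching row operation so that the running matrix stays of the form $P^T A P$. The function name `congruence_diagonalize` and the test matrix are illustrative assumptions, and the sketch deliberately gives up if it meets a zero pivot.

```python
from fractions import Fraction


def transpose(X):
    return [list(row) for row in zip(*X)]


def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def congruence_diagonalize(A):
    """Return (P, D) with P^T A P = D diagonal, for a symmetric matrix A.

    Assumption: no zero pivot is met when an off-diagonal entry must be cleared.
    """
    n = len(A)
    D = [[Fraction(v) for v in row] for row in A]
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n):
        for j in range(k + 1, n):
            if D[k][j] == 0:
                continue
            if D[k][k] == 0:
                raise ValueError("zero pivot: this simple sketch does not handle it")
            m = D[k][j] / D[k][k]
            for r in range(n):      # column operation: col j -= m * col k
                D[r][j] -= m * D[r][k]
                P[r][j] -= m * P[r][k]
            for c in range(n):      # matching row operation: row j -= m * row k
                D[j][c] -= m * D[k][c]
    return P, D


A = [[1, 2], [2, 3]]                # an arbitrary symmetric example
P, D = congruence_diagonalize(A)
```

Because the same elementary operation is applied to columns and to rows, the accumulated matrix `P` satisfies `P^T A P = D` at every step, which is exactly the congruence relation of the theorem.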

From the nature of the Lagrange reduction it is evident 
that a symmetric matrix can be diagonalized by a congruence 
transformation in many ways, and among these there is always 
at least one which will simultaneously diagonalize a second given 
matrix, provided it is positive-definite. More precisely, we have 
the following highly important theorem: 




THEOREM 3 

Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the (possibly repeated) characteristic values of the equation 
$(A - \lambda B)X = 0$, where $A$ and $B$ are hermitian (or real symmetric) (n,n) 
matrices and $B$ is positive-definite. Let $X_1, X_2, \ldots, X_n$ be $n$ independent 
characteristic vectors corresponding to $\lambda_1, \lambda_2, \ldots, \lambda_n$; and let the $X$'s be orthonormal 
with respect to $B$. Let $M$ be the matrix whose columns are the characteristic 
vectors $X_1, X_2, \ldots, X_n$; and let $D$ be the diagonal matrix whose diagonal 
elements are the characteristic values $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then $M^T B M = I$, 
and $M^T A M = D$. 

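PROOF Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the characteristic values of the equation 
$(A - \lambda B)X = 0$. Whether or not there are repeated roots among the $\lambda$'s, we 
know, from Corollary 2, Theorem 17, Sec. 11.2, that there exists a set of characteristic 
vectors $X_1, X_2, \ldots, X_n$ orthonormal with respect to $B$, that is, 
such that 

$$(1) \qquad X_i^T B X_j = \begin{cases} 0 & i \neq j \\ 1 & i = j \end{cases}$$

Now, writing the modal matrix $M$ in partitioned form, for convenience, we have 

$$(2) \qquad M = \Vert X_1 \;\; X_2 \;\; \cdots \;\; X_n \Vert$$

and 

$$(3) \qquad M^T B M = \begin{Vmatrix} X_1^T \\ X_2^T \\ \vdots \\ X_n^T \end{Vmatrix} \cdot \Vert BX_1 \;\; BX_2 \;\; \cdots \;\; BX_n \Vert = \begin{Vmatrix} X_1^T B X_1 & X_1^T B X_2 & \cdots & X_1^T B X_n \\ X_2^T B X_1 & X_2^T B X_2 & \cdots & X_2^T B X_n \\ \vdots & \vdots & & \vdots \\ X_n^T B X_1 & X_n^T B X_2 & \cdots & X_n^T B X_n \end{Vmatrix} = I \qquad \text{by (1).}$$

Also, by premultiplying Eq. (2) by $A$ and then using the fact that for each $i$ 
the $X$'s are such that $AX_i = \lambda_i B X_i$, we have 

$$\begin{aligned} AM &= \Vert AX_1 \;\; AX_2 \;\; \cdots \;\; AX_n \Vert \\ &= \Vert \lambda_1 B X_1 \;\; \lambda_2 B X_2 \;\; \cdots \;\; \lambda_n B X_n \Vert \\ &= B \,\Vert \lambda_1 X_1 \;\; \lambda_2 X_2 \;\; \cdots \;\; \lambda_n X_n \Vert \\ &= B \,\Vert X_1 \;\; X_2 \;\; \cdots \;\; X_n \Vert \cdot \begin{Vmatrix} \lambda_1 & & & O \\ & \lambda_2 & & \\ & & \ddots & \\ O & & & \lambda_n \end{Vmatrix} \\ &= BMD \end{aligned}$$

Therefore, by (3), 

$$M^T A M = M^T (BMD) = (M^T B M) D = D$$

which is the second assertion of the theorem. 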


COROLLARY 1 

If $A$ and $B$ are hermitian (or real symmetric) matrices, if $B$ is positive-definite, 
and if $M$ is a matrix whose columns are characteristic vectors of $(A - \lambda B)X = 0$ 
orthonormal with respect to $B$, then the substitution $X = MY$ simultaneously 
reduces the hermitian (quadratic) forms $X^T A X$ and $X^T B X$ to $Y^T D Y$ and 
$Y^T I Y = Y^T Y$, respectively, where $D$ is the diagonal matrix whose diagonal 
elements are the characteristic values which correspond respectively to the column 
vectors of $M$. 

The conditions under which a square matrix can be diag- 
onalized by a similarity transformation are contained in the next 
theorem: 

THEOREM 4 

An (n,n) matrix is similar to a diagonal matrix if and only if it has n independent 
characteristic vectors. 


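PROOF Let $A$ be an (n,n) matrix, and let us suppose first that $A$ is similar 
to a diagonal matrix 

$$D = \begin{Vmatrix} d_{11} & & & O \\ & d_{22} & & \\ & & \ddots & \\ O & & & d_{nn} \end{Vmatrix}$$

that is, let us suppose that there exists a matrix $S$ with the property that 

$$(4) \qquad S^{-1} A S = D$$

If, for convenience, we write $S$ in the partitioned form 

$$S = \Vert S_1 \;\; S_2 \;\; \cdots \;\; S_n \Vert$$

we have, by premultiplying Eq. (4) by $S$, 

$$AS = SD$$

$$\Vert AS_1 \;\; AS_2 \;\; \cdots \;\; AS_n \Vert = \Vert S_1 \;\; S_2 \;\; \cdots \;\; S_n \Vert \cdot \begin{Vmatrix} d_{11} & & O \\ & \ddots & \\ O & & d_{nn} \end{Vmatrix} = \Vert d_{11} S_1 \;\; d_{22} S_2 \;\; \cdots \;\; d_{nn} S_n \Vert$$

Hence it follows that 

$$A S_i = d_{ii} S_i = d_{ii} I S_i \qquad i = 1, 2, \ldots, n$$

which shows that $X = S_i$ is a characteristic vector corresponding to the characteristic 
value $\lambda_i = d_{ii}$ of the equation $(A - \lambda I)X = 0$. Thus the $n$ columns of 
the transforming matrix $S$ are characteristic vectors of the given matrix $A$. Moreover, 
since the inverse of $S$ exists, by hypothesis, it follows that $|S| \neq 0$. Hence, 
by Theorem 10, Sec. 10.5, the $n$ columns of $S$ are linearly independent. Thus, the 
matrix $A$ has $n$ linearly independent characteristic vectors, and the necessity 
assertion of the theorem is verified. 

Suppose now that $A$ has $n$ linearly independent characteristic vectors $X_1, 
X_2, \ldots, X_n$ corresponding to the (possibly repeated) characteristic values 
$\lambda_1, \lambda_2, \ldots, \lambda_n$. Then, by hypothesis, 

$$A X_i = \lambda_i I X_i = \lambda_i X_i \qquad i = 1, 2, \ldots, n$$

Now, let $S$ be the matrix whose columns are the characteristic vectors $X_1, X_2, 
\ldots, X_n$; i.e., let $S$ be a modal matrix of $A$. Then, since the characteristic vectors 
are independent, by hypothesis, it follows that $S^{-1}$ exists, and we can write 

$$\begin{aligned} S^{-1} A S &= S^{-1} A \,\Vert X_1 \;\; X_2 \;\; \cdots \;\; X_n \Vert \\ &= S^{-1} \Vert AX_1 \;\; AX_2 \;\; \cdots \;\; AX_n \Vert \\ &= S^{-1} \Vert \lambda_1 X_1 \;\; \lambda_2 X_2 \;\; \cdots \;\; \lambda_n X_n \Vert \\ &= S^{-1} \Vert X_1 \;\; X_2 \;\; \cdots \;\; X_n \Vert \cdot \begin{Vmatrix} \lambda_1 & & & O \\ & \lambda_2 & & \\ & & \ddots & \\ O & & & \lambda_n \end{Vmatrix} \\ &= S^{-1} S \begin{Vmatrix} \lambda_1 & & O \\ & \ddots & \\ O & & \lambda_n \end{Vmatrix} = \begin{Vmatrix} \lambda_1 & & O \\ & \ddots & \\ O & & \lambda_n \end{Vmatrix} \end{aligned}$$

Hence, $A$ is similar to a diagonal matrix; and the sufficiency assertion of the 
theorem is also verified. 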
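
The sufficiency half of Theorem 4 is easy to check on a small case. The following sketch uses an arbitrary (2,2) matrix chosen for illustration, not one taken from the text; its characteristic vectors were found by hand and are placed as the columns of `S`. The helper names `matmul` and `inverse_2x2` are assumptions of the sketch.

```python
from fractions import Fraction


def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def inverse_2x2(S):
    """Exact inverse of a (2,2) matrix; fails if the columns are dependent."""
    (a, b), (c, d) = S
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("columns are dependent, so S has no inverse")
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[4, 1], [2, 3]]
# Characteristic vectors found by hand:
#   A [1, -2]^T = 2 [1, -2]^T   and   A [1, 1]^T = 5 [1, 1]^T
S = [[1, 1], [-2, 1]]               # modal matrix: columns are the vectors above
D = matmul(matmul(inverse_2x2(S), A), S)
```

Here `D` comes out diagonal with the characteristic values 2 and 5 on the diagonal, in the order in which the vectors were placed in `S`. By contrast, a matrix such as `[[1, 1], [0, 1]]` has only one independent characteristic vector, so no nonsingular `S` of this kind can be formed.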

Since every hermitian and every real symmetric matrix has 
n linearly independent characteristic vectors, it is clear that the 
last theorem contains the following important special result: 


COROLLARY 1 

Every hermitian and every real symmetric matrix is similar to a diagonal matrix. 

Using the Schmidt process, it is clear that if a matrix A has n 
independent characteristic vectors, it has, in fact, a set of n orthonormal 
characteristic vectors. Moreover, as we saw in Exercise 12, 
Sec. 10.3, a matrix whose columns are orthonormal is an orthogonal 
matrix. Hence, taking the matrix S in Theorem 4 to be a matrix 
whose columns are orthonormal characteristic vectors of A, we 
have the following results: 

COROLLARY 2 

Every real symmetric matrix is orthogonally similar to a diagonal matrix. 

COROLLARY 3 

If a matrix is orthogonally similar to a diagonal matrix, it is symmetric. 

By essentially the same argument, the following companion results 
for hermitian matrices can be established: 


COROLLARY 4 

Every hermitian matrix is unitarily similar to a diagonal matrix. 




COROLLARY 5 

If a matrix is unitarily similar to a diagonal matrix, it is hermitian. 

Although not every matrix is similar to a diagonal matrix, every 
matrix is similar to a triangular matrix. Specifically, in more ad- 
vanced texts* the following results are established: 

THEOREM 5 

Every square matrix is unitarily similar to a triangular matrix. 

THEOREM 6 

Let the characteristic values of an (n,n) matrix $A$ be $\lambda_1, \lambda_2, \ldots, \lambda_k$; let the 
multiplicity of $\lambda_i$ be $m_i$; let $n_i$ be the number of linearly independent characteristic 
vectors of $A$ corresponding to $\lambda_i$; and let $D_i$ be the $(m_i, m_i)$ upper triangular matrix 
in which the diagonal elements are all $\lambda_i$, the first $m_i - n_i$ elements on the diagonal 
above the principal diagonal are each 1, and all other elements are 0. Then the 
given matrix is similar to the matrix 

$$D = \begin{Vmatrix} D_1 & & & O \\ & D_2 & & \\ & & \ddots & \\ O & & & D_k \end{Vmatrix}$$

The standard, or canonical, form described in the last theorem 
is known as the Jordan canonical form.† 

Many of the theorems of the last two sections find their most 
immediate physical application in the analysis of vibrating systems, 
either mechanical or electrical. In particular, the orthogonality 
of the characteristic vectors of a matric equation makes it 
possible to impose initial conditions of displacement and velocity 
on a system with a finite number of degrees of freedom in a way 
that closely resembles the corresponding procedure for boundary 
value problems involving continuous systems (Secs. 8.4 and 8.5). 
The following example illustrates these ideas. 

* See, for instance, L. Mirsky, “Linear Algebra,” p. 307, Oxford Book 
Company, Inc., New York, 1955. 

† Named for the French mathematician Camille Jordan (1838-1922). 

EXAMPLE 1 

The three masses shown in Fig. 11.1a are initially displaced so that 

$$(x_1)_0 = 2 \qquad (x_2)_0 = -1 \qquad (x_3)_0 = 1$$

From these positions they begin to move with initial velocities 

$$(v_1)_0 = 1 \qquad (v_2)_0 = 2 \qquad (v_3)_0 = 0$$

Assuming that there is no friction in the system, determine the subsequent motion of each mass. 

[FIGURE 11.1 A three-mass system in equilibrium (a) and in a displaced position (b). Net changes in spring lengths: $x_1$, $x_2 - x_1$, $x_3 - x_2$, $-x_3$ ($x_1$, $x_3$ shown positive; $x_2$ shown negative).] 

Since friction is assumed to be negligible, the only forces acting are those transmitted to the 
masses by the springs directly attached to them. Now, when the instantaneous displacements of 
the masses are $x_1$, $x_2$, and $x_3$, the lengths of the springs have changed from their unstretched, 
equilibrium lengths by the respective amounts (Fig. 11.1b) 

$$x_1 \qquad x_2 - x_1 \qquad x_3 - x_2 \qquad -x_3$$

Hence, the forces instantaneously exerted by the springs are, respectively, 

$$3x_1 \qquad 3(x_2 - x_1) \qquad 3(x_3 - x_2) \qquad -x_3$$

where plus signs indicate that the springs are in tension, minus signs that the springs are 
in compression. Therefore, applying Newton's law to each of the masses in turn, we obtain the 
three differential equations 

$$\begin{aligned} 6\,\frac{d^2 x_1}{dt^2} &= -3x_1 + 3(x_2 - x_1) \\ 4\,\frac{d^2 x_2}{dt^2} &= -3(x_2 - x_1) + 3(x_3 - x_2) \\ 4\,\frac{d^2 x_3}{dt^2} &= -3(x_3 - x_2) - x_3 \end{aligned}$$

or 

$$(5) \qquad \begin{aligned} (6D^2 + 6)x_1 - 3x_2 &= 0 \\ -3x_1 + (4D^2 + 6)x_2 - 3x_3 &= 0 \\ -3x_2 + (4D^2 + 4)x_3 &= 0 \end{aligned}$$

or, in matric notation, simply 

$$P(D)X = 0$$

where 

$$P(D) = \begin{Vmatrix} 6D^2 + 6 & -3 & 0 \\ -3 & 4D^2 + 6 & -3 \\ 0 & -3 & 4D^2 + 4 \end{Vmatrix} \qquad \text{and} \qquad X = \begin{Vmatrix} x_1 \\ x_2 \\ x_3 \end{Vmatrix}$$

Since there is no dissipation of energy through friction, it is clear that each mass must 
vibrate around its equilibrium position with constant amplitude. Hence, as a solution we assume 

$$X = A \cos \omega t$$

that is, 

$$x_1 = a_1 \cos \omega t \qquad x_2 = a_2 \cos \omega t \qquad x_3 = a_3 \cos \omega t$$

where $\omega$ is an unknown frequency and $a_1$, $a_2$, and $a_3$ are the unknown amplitudes through which 
the masses oscillate. Substituting these into the differential equations in (5) and dividing out the 
common factor $\cos \omega t$, we obtain the three algebraic equations 

$$(6) \qquad \begin{aligned} (-6\omega^2 + 6)a_1 - 3a_2 &= 0 \\ -3a_1 + (-4\omega^2 + 6)a_2 - 3a_3 &= 0 \\ -3a_2 + (-4\omega^2 + 4)a_3 &= 0 \end{aligned}$$

from which to determine $a_1$, $a_2$, and $a_3$. This system will have a nontrivial solution if and only if 
the determinant of its coefficients is equal to zero. Hence, we must have 

$$\begin{vmatrix} -6\omega^2 + 6 & -3 & 0 \\ -3 & -4\omega^2 + 6 & -3 \\ 0 & -3 & -4\omega^2 + 4 \end{vmatrix} = -6(4\omega^2 - 1)(\omega^2 - 1)(4\omega^2 - 9) = 0$$

Thus the system (6) has a nontrivial solution for $\omega^2 = \tfrac14, 1, \tfrac94$ and for no other values of $\omega^2$. 

The natural frequencies of the physical system are, therefore, 

$$\omega_1 = \tfrac12 \qquad \omega_2 = 1 \qquad \omega_3 = \tfrac32$$
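Now, according to Theorem 8, Sec. 10.5, the values of $a_1$, $a_2$, and $a_3$ which satisfy (6) when 
the determinant of its coefficients is equal to zero can be read from any (2,3) matrix of rank 2 
contained in the coefficient matrix. Hence, using the matrix of the coefficients of the last two 
equations in the set, we have, in the three nontrivial cases, 

$$\omega^2 = \tfrac14: \quad \begin{Vmatrix} -3 & 5 & -3 \\ 0 & -3 & 3 \end{Vmatrix} \qquad a_1 : a_2 : a_3 = \begin{vmatrix} 5 & -3 \\ -3 & 3 \end{vmatrix} : -\begin{vmatrix} -3 & -3 \\ 0 & 3 \end{vmatrix} : \begin{vmatrix} -3 & 5 \\ 0 & -3 \end{vmatrix} = 6 : 9 : 9 = 2 : 3 : 3$$

$$\omega^2 = 1: \quad \begin{Vmatrix} -3 & 2 & -3 \\ 0 & -3 & 0 \end{Vmatrix} \qquad a_1 : a_2 : a_3 = \begin{vmatrix} 2 & -3 \\ -3 & 0 \end{vmatrix} : -\begin{vmatrix} -3 & -3 \\ 0 & 0 \end{vmatrix} : \begin{vmatrix} -3 & 2 \\ 0 & -3 \end{vmatrix} = -9 : 0 : 9 = 1 : 0 : -1$$

$$\omega^2 = \tfrac94: \quad \begin{Vmatrix} -3 & -3 & -3 \\ 0 & -3 & -5 \end{Vmatrix} \qquad a_1 : a_2 : a_3 = \begin{vmatrix} -3 & -3 \\ -3 & -5 \end{vmatrix} : -\begin{vmatrix} -3 & -3 \\ 0 & -5 \end{vmatrix} : \begin{vmatrix} -3 & -3 \\ 0 & -3 \end{vmatrix} = 6 : -15 : 9 = 2 : -5 : 3$$

These give three particular solution vectors for the system (5), namely, 

$$X_1 = \begin{Vmatrix} 2 \\ 3 \\ 3 \end{Vmatrix} \cos \frac{t}{2} \qquad X_2 = \begin{Vmatrix} 1 \\ 0 \\ -1 \end{Vmatrix} \cos t \qquad X_3 = \begin{Vmatrix} 2 \\ -5 \\ 3 \end{Vmatrix} \cos \frac{3t}{2}$$

Clearly, if we had begun with the assumptions 

$$x_1 = a_1 \sin \omega t \qquad x_2 = a_2 \sin \omega t \qquad x_3 = a_3 \sin \omega t$$

we would also have obtained the algebraic equations (6) and, hence, the same three values of $\omega$ 
and the same solution vectors. Therefore, we have three more particular solutions: 

$$X_4 = \begin{Vmatrix} 2 \\ 3 \\ 3 \end{Vmatrix} \sin \frac{t}{2} \qquad X_5 = \begin{Vmatrix} 1 \\ 0 \\ -1 \end{Vmatrix} \sin t \qquad X_6 = \begin{Vmatrix} 2 \\ -5 \\ 3 \end{Vmatrix} \sin \frac{3t}{2}$$

and, finally, the complete solution 

$$(7) \qquad X = c_1 X_1 + c_2 X_2 + c_3 X_3 + c_4 X_4 + c_5 X_5 + c_6 X_6$$

where the $c$'s are arbitrary scalar coefficients. 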
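
The frequency determinant and the three amplitude vectors can be checked by exact arithmetic. In the sketch below, `V` and `T` are simply names for the matrices of constant terms and of $\omega^2$ coefficients in system (6), so that the coefficient matrix of (6) is `V - w2*T` with `w2` standing for $\omega^2$; the helper names are assumptions made for the illustration.

```python
from fractions import Fraction as F

V = [[6, -3, 0], [-3, 6, -3], [0, -3, 4]]    # constant terms of system (6)
T = [[6, 0, 0], [0, 4, 0], [0, 0, 4]]        # coefficients of omega^2 in (6)


def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def pencil(w2):
    """Coefficient matrix V - w2*T of system (6); w2 stands for omega squared."""
    return [[V[r][c] - w2 * T[r][c] for c in range(3)] for r in range(3)]


def apply(M, a):
    return [sum(M[r][c] * a[c] for c in range(3)) for r in range(3)]


# omega^2 paired with the amplitude ratios read off in the text
modes = {F(1, 4): [2, 3, 3], F(1): [1, 0, -1], F(9, 4): [2, -5, 3]}

dets = {w2: det3(pencil(w2)) for w2 in modes}
residuals = {w2: apply(pencil(w2), a) for w2, a in modes.items()}
```

Each entry of `dets` is zero and each entry of `residuals` is the zero vector, confirming that the three pairs satisfy (6); for any other value of `w2` the determinant is nonzero and only the trivial solution exists.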

To determine the values of the $c$'s we must, of course, use the given initial conditions. The 
most convenient way to do this is to write the system (6) in the form 

$$(8) \qquad (V - \omega^2 T)A = 0$$

where 

$$V = \begin{Vmatrix} 6 & -3 & 0 \\ -3 & 6 & -3 \\ 0 & -3 & 4 \end{Vmatrix} \qquad T = \begin{Vmatrix} 6 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 4 \end{Vmatrix} \qquad A = \begin{Vmatrix} a_1 \\ a_2 \\ a_3 \end{Vmatrix}$$

and then recall from Sec. 11.2 that the solution vectors of (8), namely, 

$$A_1 = \begin{Vmatrix} 2 \\ 3 \\ 3 \end{Vmatrix} \qquad A_2 = \begin{Vmatrix} 1 \\ 0 \\ -1 \end{Vmatrix} \qquad A_3 = \begin{Vmatrix} 2 \\ -5 \\ 3 \end{Vmatrix}$$

satisfy the orthogonality condition 

$$(9) \qquad A_i^T T A_j = 0 \qquad i \neq j$$

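To take advantage of this property, we first set $t = 0$ in (7) and substitute the initial displacement 
vector for $X(0)$, getting 

$$(10) \qquad \begin{Vmatrix} 2 \\ -1 \\ 1 \end{Vmatrix} = c_1 \begin{Vmatrix} 2 \\ 3 \\ 3 \end{Vmatrix} + c_2 \begin{Vmatrix} 1 \\ 0 \\ -1 \end{Vmatrix} + c_3 \begin{Vmatrix} 2 \\ -5 \\ 3 \end{Vmatrix}$$

Then, if we multiply this equation through on the left by 

$$A_1^T T = \Vert 2 \;\; 3 \;\; 3 \Vert \cdot \begin{Vmatrix} 6 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 4 \end{Vmatrix} = \Vert 12 \;\; 12 \;\; 12 \Vert$$

the second and third terms on the right vanish because of the orthogonality property (9), and 
we have simply 

$$\Vert 12 \;\; 12 \;\; 12 \Vert \cdot \begin{Vmatrix} 2 \\ -1 \\ 1 \end{Vmatrix} = c_1 \Vert 12 \;\; 12 \;\; 12 \Vert \cdot \begin{Vmatrix} 2 \\ 3 \\ 3 \end{Vmatrix} \qquad \text{or} \qquad c_1 = \tfrac14$$

Similarly, multiplying (10) on the left by 

$$A_2^T T = \Vert 6 \;\; 0 \;\; -4 \Vert \qquad \text{and by} \qquad A_3^T T = \Vert 12 \;\; -20 \;\; 12 \Vert$$

in turn, we find 

$$c_2 = \tfrac45 \qquad c_3 = \tfrac{7}{20}$$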


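To find $c_4$, $c_5$, and $c_6$ we first differentiate Eq. (7), getting 

$$\frac{dX}{dt} = -\frac{c_1}{2} \begin{Vmatrix} 2 \\ 3 \\ 3 \end{Vmatrix} \sin \frac{t}{2} - c_2 \begin{Vmatrix} 1 \\ 0 \\ -1 \end{Vmatrix} \sin t - \frac{3c_3}{2} \begin{Vmatrix} 2 \\ -5 \\ 3 \end{Vmatrix} \sin \frac{3t}{2} + \frac{c_4}{2} \begin{Vmatrix} 2 \\ 3 \\ 3 \end{Vmatrix} \cos \frac{t}{2} + c_5 \begin{Vmatrix} 1 \\ 0 \\ -1 \end{Vmatrix} \cos t + \frac{3c_6}{2} \begin{Vmatrix} 2 \\ -5 \\ 3 \end{Vmatrix} \cos \frac{3t}{2}$$

Then, setting $t = 0$ and replacing $\left. \dfrac{dX}{dt} \right|_{t=0}$ by the given initial velocity vector $\begin{Vmatrix} 1 \\ 2 \\ 0 \end{Vmatrix}$, we have 

$$(11) \qquad \begin{Vmatrix} 1 \\ 2 \\ 0 \end{Vmatrix} = \tfrac12 c_4 \begin{Vmatrix} 2 \\ 3 \\ 3 \end{Vmatrix} + c_5 \begin{Vmatrix} 1 \\ 0 \\ -1 \end{Vmatrix} + \tfrac32 c_6 \begin{Vmatrix} 2 \\ -5 \\ 3 \end{Vmatrix}$$

Finally, multiplying this equation on the left by 

$$A_1^T T = \Vert 12 \;\; 12 \;\; 12 \Vert \qquad A_2^T T = \Vert 6 \;\; 0 \;\; -4 \Vert \qquad \text{and} \qquad A_3^T T = \Vert 12 \;\; -20 \;\; 12 \Vert$$

in turn, we find 

$$c_4 = \tfrac34 \qquad c_5 = \tfrac35 \qquad \text{and} \qquad c_6 = -\tfrac{7}{60}$$

With the $c$'s determined, the solution is now complete, and we have 

$$X = \tfrac14 X_1 + \tfrac45 X_2 + \tfrac{7}{20} X_3 + \tfrac34 X_4 + \tfrac35 X_5 - \tfrac{7}{60} X_6$$

or, explicitly, 

$$\begin{aligned} x_1 &= \frac12 \cos \frac{t}{2} + \frac45 \cos t + \frac{7}{10} \cos \frac{3t}{2} + \frac32 \sin \frac{t}{2} + \frac35 \sin t - \frac{7}{30} \sin \frac{3t}{2} \\ x_2 &= \frac34 \cos \frac{t}{2} - \frac74 \cos \frac{3t}{2} + \frac94 \sin \frac{t}{2} + \frac{7}{12} \sin \frac{3t}{2} \\ x_3 &= \frac34 \cos \frac{t}{2} - \frac45 \cos t + \frac{21}{20} \cos \frac{3t}{2} + \frac94 \sin \frac{t}{2} - \frac35 \sin t - \frac{7}{20} \sin \frac{3t}{2} \end{aligned}$$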

We have already identified the three values $\omega = \tfrac12, 1, \tfrac32$ as the natural frequencies of the system, 
i.e., the only frequencies at which free vibrations of the system are possible, and we have illustrated 
how the motion produced by an arbitrary set of initial conditions involves simultaneously 
vibrations at each of the natural frequencies. The vectors $A_1$, $A_2$, and $A_3$, associated, respectively, 
with the frequencies $\omega = \tfrac12$, $\omega = 1$, and $\omega = \tfrac32$, are called the normal modes of the system. Each 
describes the relative amplitudes with which the three masses would vibrate if the system were 
set in motion in such a way that it vibrated only at the corresponding natural frequency. The 
absolute amplitudes depend upon the $c$'s, of course, and so are determined by the initial conditions, 
but at each natural frequency the ratios of the amplitudes with which the masses oscillate are 
always the same, regardless of their actual numerical values. Figure 11.2 illustrates this behavior 
for one full cycle of the motion at each of the three natural frequencies. 
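
The orthogonality argument used to find the $c$'s reduces to three inner products per initial condition, which the following sketch carries out exactly. It assumes the data of Example 1 (masses 6, 4, 4 on the diagonal of $T$, mode vectors 2:3:3, 1:0:-1, 2:-5:3, initial displacements 2, -1, 1 and initial velocities 1, 2, 0); the names `t_inner`, `cos_coeffs`, and `sin_coeffs` are illustrative.

```python
from fractions import Fraction as F

T_DIAG = [6, 4, 4]                              # diagonal of the matrix T
# (natural frequency, mode vector) pairs from the example
modes = [(F(1, 2), [2, 3, 3]), (F(1), [1, 0, -1]), (F(3, 2), [2, -5, 3])]
x0 = [2, -1, 1]                                 # initial displacements
v0 = [1, 2, 0]                                  # initial velocities


def t_inner(a, b):
    """Inner product a^T T b for the diagonal matrix T."""
    return sum(t * p * q for t, p, q in zip(T_DIAG, a, b))


# Multiplying (10) by A_i^T T isolates c_i; multiplying (11) isolates omega_i * c_{i+3}.
cos_coeffs = [F(t_inner(a, x0), t_inner(a, a)) for _, a in modes]
sin_coeffs = [F(t_inner(a, v0), t_inner(a, a)) / w for w, a in modes]

# Rebuild the initial state from the coefficients as a consistency check.
x_at_0 = [sum(c * a[k] for c, (_, a) in zip(cos_coeffs, modes)) for k in range(3)]
v_at_0 = [sum(w * c * a[k] for c, (w, a) in zip(sin_coeffs, modes)) for k in range(3)]
```

No simultaneous equations are solved anywhere in this computation, which is the practical payoff of working with vectors that are orthogonal with respect to $T$.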

To conclude our discussion, let us now apply to this problem the results of Theorem 3, 
Sec. 11.3. To do this, we return to the matric equation (8), namely, $(V - \omega^2 T)A = 0$, and 
observe that $V$ and $T$ are both symmetric and that $T$ is positive-definite. Hence, the hypotheses 
of Corollary 1, Theorem 3, Sec. 11.3, are fulfilled; therefore, if $A_1^*$, $A_2^*$, $A_3^*$ are solution vectors of 
Eq. (8) orthonormal with respect to $T$ and if $M$ is the matrix $\Vert A_1^* \;\; A_2^* \;\; A_3^* \Vert$, the substitution 

$$X = MY$$

will simultaneously reduce the quadratic forms $X^T V X$ and $X^T T X$ to the respective diagonal 
forms $Y^T D Y$ and $Y^T Y$, where $D$ is the diagonal matrix of characteristic values 

$$D = \begin{Vmatrix} \tfrac14 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \tfrac94 \end{Vmatrix}$$

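To verify this, we note first that, when the solution vectors $A_1$, $A_2$, $A_3$ 
are normalized with respect to $T$, we obtain 

$$A_1^* = \frac{A_1}{\sqrt{A_1^T T A_1}} = \frac{1}{\sqrt{96}} \begin{Vmatrix} 2 \\ 3 \\ 3 \end{Vmatrix} \qquad A_2^* = \frac{A_2}{\sqrt{A_2^T T A_2}} = \frac{1}{\sqrt{10}} \begin{Vmatrix} 1 \\ 0 \\ -1 \end{Vmatrix} \qquad A_3^* = \frac{A_3}{\sqrt{A_3^T T A_3}} = \frac{1}{\sqrt{160}} \begin{Vmatrix} 2 \\ -5 \\ 3 \end{Vmatrix}$$

Hence, the required substitution $X = MY$ is 

$$\begin{Vmatrix} x_1 \\ x_2 \\ x_3 \end{Vmatrix} = \begin{Vmatrix} \dfrac{2}{\sqrt{96}} & \dfrac{1}{\sqrt{10}} & \dfrac{2}{\sqrt{160}} \\ \dfrac{3}{\sqrt{96}} & 0 & \dfrac{-5}{\sqrt{160}} \\ \dfrac{3}{\sqrt{96}} & \dfrac{-1}{\sqrt{10}} & \dfrac{3}{\sqrt{160}} \end{Vmatrix} \cdot \begin{Vmatrix} y_1 \\ y_2 \\ y_3 \end{Vmatrix}$$

or 

$$(12) \qquad \begin{aligned} x_1 &= \frac{2y_1}{\sqrt{96}} + \frac{y_2}{\sqrt{10}} + \frac{2y_3}{\sqrt{160}} \\ x_2 &= \frac{3y_1}{\sqrt{96}} - \frac{5y_3}{\sqrt{160}} \\ x_3 &= \frac{3y_1}{\sqrt{96}} - \frac{y_2}{\sqrt{10}} + \frac{3y_3}{\sqrt{160}} \end{aligned}$$

[FIGURE 11.2 The normal modes of the system shown in Fig. 11.1. Recoverable panel labels: relative amplitudes 2 : 3 : 3 and relative amplitudes 2 : -5 : 3.] 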

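Finally, introducing these expressions into the quadratic forms 

$$X^T V X = 6x_1^2 - 6x_1 x_2 + 6x_2^2 - 6x_2 x_3 + 4x_3^2 \qquad \text{and} \qquad X^T T X = 6x_1^2 + 4x_2^2 + 4x_3^2$$

we obtain, respectively, 

$$\begin{aligned} X^T V X = {} & 6\left( \frac{2y_1}{\sqrt{96}} + \frac{y_2}{\sqrt{10}} + \frac{2y_3}{\sqrt{160}} \right)^2 - 6\left( \frac{2y_1}{\sqrt{96}} + \frac{y_2}{\sqrt{10}} + \frac{2y_3}{\sqrt{160}} \right)\left( \frac{3y_1}{\sqrt{96}} - \frac{5y_3}{\sqrt{160}} \right) + 6\left( \frac{3y_1}{\sqrt{96}} - \frac{5y_3}{\sqrt{160}} \right)^2 \\ & - 6\left( \frac{3y_1}{\sqrt{96}} - \frac{5y_3}{\sqrt{160}} \right)\left( \frac{3y_1}{\sqrt{96}} - \frac{y_2}{\sqrt{10}} + \frac{3y_3}{\sqrt{160}} \right) + 4\left( \frac{3y_1}{\sqrt{96}} - \frac{y_2}{\sqrt{10}} + \frac{3y_3}{\sqrt{160}} \right)^2 \\ = {} & \tfrac14 y_1^2 + y_2^2 + \tfrac94 y_3^2 \end{aligned}$$

$$\begin{aligned} X^T T X &= 6\left( \frac{2y_1}{\sqrt{96}} + \frac{y_2}{\sqrt{10}} + \frac{2y_3}{\sqrt{160}} \right)^2 + 4\left( \frac{3y_1}{\sqrt{96}} - \frac{5y_3}{\sqrt{160}} \right)^2 + 4\left( \frac{3y_1}{\sqrt{96}} - \frac{y_2}{\sqrt{10}} + \frac{3y_3}{\sqrt{160}} \right)^2 \\ &= y_1^2 + y_2^2 + y_3^2 \end{aligned}$$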


In the present problem it is easy to identify the two quadratic forms $X^T V X$ and $X^T T X$. In 
fact, since the energy stored in a spring stretched a distance $s$ is $ks^2/2$, it follows that the instantaneous 
potential energy in our system is 

$$\tfrac12 \left[ 3x_1^2 + 3(x_2 - x_1)^2 + 3(x_3 - x_2)^2 + x_3^2 \right] = \tfrac12 \left[ 6x_1^2 - 6x_1 x_2 + 6x_2^2 - 6x_2 x_3 + 4x_3^2 \right] = \tfrac12 X^T V X$$

Hence, $X^T V X$ is equal to twice the instantaneous potential energy of the system. 

Also, the kinetic energy of a mass moving with velocity $v$ is $mv^2/2$. Hence, the instantaneous 
kinetic energy of our system is 

$$\tfrac12 \left( 6\dot{x}_1^2 + 4\dot{x}_2^2 + 4\dot{x}_3^2 \right) = \tfrac12 \dot{X}^T T \dot{X}$$

From this it follows that, when the system is vibrating at any one of its natural frequencies $\omega_i$, 
its maximum kinetic energy is 

$$\frac{\omega_i^2}{2} X^T T X$$

The new coordinates $y_1$, $y_2$, $y_3$, defined by (12) and in terms of which the two energy 
expressions appear as sums of squares, are known as the normal coordinates of the system. 
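
The reduction to sums of squares can also be confirmed numerically rather than by expanding the quadratic forms by hand. The sketch below assumes the matrices $V$ and $T$ and the mode vectors of Example 1, normalizes each mode with respect to $T$, and forms the entries of $M^T T M$ and $M^T V M$ directly; floating-point square roots are used, so the comparison is only to within rounding error.

```python
import math

V = [[6, -3, 0], [-3, 6, -3], [0, -3, 4]]
T = [[6, 0, 0], [0, 4, 0], [0, 0, 4]]
modes = [[2, 3, 3], [1, 0, -1], [2, -5, 3]]


def bilinear(Q, a, b):
    """Value of a^T Q b."""
    return sum(a[r] * Q[r][c] * b[c] for r in range(3) for c in range(3))


# Columns of M: each mode divided by sqrt(A_i^T T A_i), i.e. sqrt(96), sqrt(10), sqrt(160).
M_cols = [[x / math.sqrt(bilinear(T, a, a)) for x in a] for a in modes]

MtTM = [[bilinear(T, p, q) for q in M_cols] for p in M_cols]   # should be close to I
MtVM = [[bilinear(V, p, q) for q in M_cols] for p in M_cols]   # should be close to D
```

The diagonal of `MtVM` reproduces the squared natural frequencies 1/4, 1, 9/4, which are the coefficients of the squares of the normal coordinates in twice the potential energy.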


EXERCISES 

1 For each of the following matrices A, find two pairs of nonsingular matrices (P,Q) such that 
PAQ is a diagonal matrix: 


II 1 2 II 

b 

II 1 “Ml 

c 1 

-1 

Ml 

d 

1 0 

3 

II 3 4 || 


II 0 3 1| 

2 

1 

2 


1 -1 

1 




0 

1 

3 H 


-1 3 

3 


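2 For each of the following matrices $A$, find two nonsingular matrices $P$ such that $P^T A P$ is a 
diagonal matrix: 

$$\text{a}\ \begin{Vmatrix} 1 & 2 \\ 2 & 3 \end{Vmatrix} \qquad \text{b}\ \begin{Vmatrix} 1 & -1 \\ -1 & 0 \end{Vmatrix} \qquad \text{c}\ \begin{Vmatrix} 1 & 1 & 1 \\ 1 & 2 & 0 \\ 1 & 0 & 3 \end{Vmatrix} \qquad \text{d}\ \begin{Vmatrix} 1 & 2 & 0 \\ 2 & 5 & 2 \\ 0 & 2 & 4 \end{Vmatrix}$$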


3 For each of the following pairs of matrices $(A,B)$, find a congruence transformation which 
will simultaneously reduce $A$ and $B$ to diagonal form, and carry out the diagonalization: 


1 


8 0 
0 2 
0 0 

1 

3 

-1 


::i 

3 • 
-1 
0 

7 : 
-1 
0 


-2 

10 


a If A and B are hermitian (or real symmetric) matrices, show that there may exist con- 
gruence transformations which will simultaneously diagonalize A and B even though B is 
not definite. 

I I — *2 1 1| 
ii 

Hlo -2 1 

c Find a congruence transformation which will simultaneously diagonalize || ® j| 

and II 


'1 -51 


Find similarity transformations which will reduce each of the following matrices to diagonal 
form; 

5 111 


a 

-3 2 1| b 

0 -2 1| c 


-10 6 1| 

-2 0 1| 

d 

5 -2 -1|| e 

2 -3 3 1| f 


-1 4 -1 

0 3-1 


1 -2 3 1| 

0-1 3 1| 


6 Work Example 1 with X 0 


The system shown in Fig. 11.3 begins to move with initial displacement Xo = 
ki = l m t — 1 k xz ~2 m z - 2 ft 2 = 2 fe 



initial velocity -£ 0 = 
subsequent motion. 


. Assuming that there is no friction in the system, determine its 




8 The system shown in Fig. 11.4 begins to move with initial displacement - jj 1 jj and 

initial velocity Xo = JJ 0 JJ, Assuming that there is no friction in the system, determine its 
subsequent motion. 



subsequent motion. • 



11.4 

Functions of a square matrix 

In Sec. 10.2, after we had defined matric multiplication, we were 
able to define positive integral powers of a square matrix $A$ and 
to verify that, for arbitrary positive integers $r$ and $s$, 

$$(1) \qquad A^r A^s = A^s A^r = A^{r+s}$$

Moreover, we verified in Sec. 10.3 that, if $A$ is a nonsingular 
matrix, it has an inverse $A^{-1}$ such that $AA^{-1} = A^{-1}A = I$, and 
we defined negative integral powers of $A$ by the relation 

$$A^{-n} = (A^{-1})^n$$

Thus, after we introduced the definition $A^0 = I$, it became clear 
that, for any nonsingular matrix $A$, Eq. (1) holds for all integral 
values of $r$ and $s$. It is now natural to define polynomial functions 
of a square matrix and, if possible, rational fractional functions: 

DEFINITION 1 

A polynomial function of a square matrix $A$ is a finite linear combination of 
nonnegative integral powers of $A$, 

$$p(A) = a_0 A^n + a_1 A^{n-1} + \cdots + a_{n-1} A + a_n I$$

EXAMPLE 1 


If ^ ~ 1 3 -4 1 ,nd 

then 

p(A) =A* + 5A +4/- | _g 

It is interesting to note that $p(A)$ can also be evaluated by using the factored forms of $p(x)$, 
namely, 

$$p(x) = x^2 + 5x + 4 = (x + 4)(x + 1) = (x + 1)(x + 4)$$


, M +4 W+ /,.(!' — 4 1 1 ^ || o :|IKII: jHIo 

'la si'i: Ji-ir: :il 

pW . M+ 7,(4 + 47,.(||j _J|+|; “I) -(I* j|+*|j 

Hi: JlNi: ii-ls :i 


I) 


I) 


In this example it is of course not clear, especially in view 
of the noncommutative character of matric multiplication, 
whether the fact that $p(A)$ can be computed equally well from 

$$A^2 + 5A + 4I \qquad (A + 4I)(A + I) \qquad \text{and} \qquad (A + I)(A + 4I)$$

is a result of some special property of $A$ and $p(x)$ or is illustrative 
of some general principle. Actually the latter is the case; in fact, 
any identical relation involving sums and products of scalar 
polynomials is valid for the corresponding matric polynomials, 
as the following important, but almost obvious, theorem assures us: 




THEOREM 1 

Any polynomial identity between scalar polynomials implies a corresponding 
identity for matric polynomials. 

PROOF Clearly, any polynomial relation between scalar polynomials can be 
constructed using only the operations of addition and multiplication. For instance, 

$$f_1(x) f_2(x) + f_3(x) = f_4(x)$$

is completely equivalent to the chain of relations 

$$\phi(x) = f_4(x) \qquad \phi(x) = \psi(x) + f_3(x) \qquad \psi(x) = f_1(x) f_2(x)$$

Hence, to prove the theorem it is sufficient to show that, for any polynomials 
$f$, $g$, $s$, $p$ and any square matrix $A$, 

a If $f(x) + g(x) = s(x)$, then $f(A) + g(A) = s(A)$ 
b If $f(x)g(x) = p(x)$, then $f(A)g(A) = p(A)$ 

To prove the first of these, let 

$$f(x) = \sum_{i=0}^{m} a_i x^i \qquad g(x) = \sum_{i=0}^{n} b_i x^i \qquad s(x) = \sum_{i=0}^{t} c_i x^i$$

where $t = \max(m,n)$, $c_i = a_i + b_i$, and the coefficients of any powers of $x$ which 
are not present are understood to be zero. Then 

$$f(A) + g(A) = \sum_{i=0}^{m} a_i A^i + \sum_{i=0}^{n} b_i A^i = \sum_{i=0}^{t} (a_i + b_i) A^i = \sum_{i=0}^{t} c_i A^i = s(A) \qquad \text{as asserted.}$$

To prove part b, let 

$$f(x) = \sum_{i=0}^{m} a_i x^i \qquad g(x) = \sum_{j=0}^{n} b_j x^j \qquad p(x) = \sum_{k=0}^{t} c_k x^k$$

where $t = m + n$ and $c_k = \sum_{i,j} a_i b_j$, the summation extending over all values of 
$i$ and $j$ such that $i + j = k$ and, of course, $0 \le i \le m$, $0 \le j \le n$. Then, using 
the distributive property of matric multiplication and the associative and commutative 
properties of matric addition, we have 

$$f(A)g(A) = \left( \sum_{i=0}^{m} a_i A^i \right) \left( \sum_{j=0}^{n} b_j A^j \right) = \sum_{i=0}^{m} \sum_{j=0}^{n} a_i A^i b_j A^j = \sum_{i=0}^{m} \sum_{j=0}^{n} a_i b_j A^{i+j}$$

or, grouping together all terms involving the same power of $A$, 

$$f(A)g(A) = \sum_{k=0}^{m+n} c_k A^k \qquad \text{where } k = i + j \text{ and } c_k = \sum_{i+j=k} a_i b_j$$

$$= p(A) \qquad \text{as asserted.}$$

Since $f(x)g(x) = g(x)f(x)$, it follows from Theorem 1 that 

$$(2) \qquad f(A)g(A) = g(A)f(A)$$

In other words, we have the following important result : 


COROLLARY 1 

Any two polynomials in a matrix A commute with each other. 
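
Theorem 1 and its corollary are easy to test on a particular matrix. The sketch below uses an arbitrary (2,2) integer matrix chosen for illustration (it is not claimed to be the matrix of Example 1) and compares the expanded and factored forms of $p(x) = x^2 + 5x + 4$, as well as the two orders of multiplication of a second pair of polynomials; all helper names are assumptions of the sketch.

```python
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def scale(k, X):
    return [[k * a for a in row] for row in X]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


A = [[1, 2], [3, -4]]               # arbitrary illustrative matrix
I = identity(2)

expanded = add(add(matmul(A, A), scale(5, A)), scale(4, I))       # A^2 + 5A + 4I
factored_1 = matmul(add(A, scale(4, I)), add(A, I))               # (A + 4I)(A + I)
factored_2 = matmul(add(A, I), add(A, scale(4, I)))               # (A + I)(A + 4I)

f_of_A = add(matmul(A, A), I)                                     # f(x) = x^2 + 1
g_of_A = add(scale(3, A), scale(-2, I))                           # g(x) = 3x - 2
```

The three forms of $p(A)$ agree, and `f_of_A` and `g_of_A` commute, even though two arbitrary matrices generally do not; the agreement depends only on every factor being a polynomial in the same matrix `A`.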

If $g(A)$ is a nonsingular matrix, then $g^{-1}(A)$ exists, and we 
may premultiply and postmultiply each side of Eq. (2) by $g^{-1}(A)$, 
getting 

$$g^{-1}(A) f(A) g(A) g^{-1}(A) = g^{-1}(A) g(A) f(A) g^{-1}(A)$$

or 

$$(3) \qquad g^{-1}(A) f(A) = f(A) g^{-1}(A)$$

With this identity, we are now in a position to define rational 
fractional functions of a square matrix $A$: 

DEFINITION 2 

If $f(x)$ and $g(x)$ are scalar polynomials and if $A$ is a square matrix such that $g(A)$ 
is nonsingular, then either of the equal matrices $g^{-1}(A)f(A)$ and $f(A)g^{-1}(A)$ is 
called the quotient of $f(A)$ by $g(A)$ and is written $f(A)/g(A)$. 

It is now relatively easy to prove the following extension 
of Theorem 1 (see Exercise 3) : 


THEOREM 2 

Any identity between rational fractional functions of a scalar variable implies a 
corresponding matric identity, provided all the matric functions are defined. 

With rational functions of a square matrix now defined, it is 
natural to ask whether the characteristic values of a rational 
function of a matrix A can be expressed in terms of the charac- 
teristic values of A. This is indeed the case, as the following 
chain of theorems makes clear: 

THEOREM 3 

If $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the (possibly repeated) characteristic values of a square 
matrix $A$ and if $f$ is any polynomial, then 

$$|f(A)| = f(\lambda_1) f(\lambda_2) \cdots f(\lambda_n)$$

PROOF Let the characteristic polynomial of the given matrix $A$ be 

$$(4) \qquad |A - \lambda I| = \prod_{i=1}^{n} (\lambda_i - \lambda)$$

and let the factored form of the given polynomial $f$ be 

$$(5) \qquad f(t) = c(t - r_1)(t - r_2) \cdots (t - r_k)$$

Then, since Theorem 1 assures us that identities between scalar polynomials 
imply corresponding matric identities, we have 

$$f(A) = c(A - r_1 I)(A - r_2 I) \cdots (A - r_k I)$$

Furthermore, since the determinant of a product of square matrices is equal to 
the product of the determinants of the matric factors, and since the scalar factor $c$ 
incorporated into any one of the matric factors reappears as the factor $c^n$ in the 
determinant of that matrix, we have 

$$(6) \qquad |f(A)| = c^n |A - r_1 I| \cdot |A - r_2 I| \cdots |A - r_k I| = c^n \prod_{j=1}^{k} |A - r_j I|$$

However, $|A - r_j I|$ is just the characteristic polynomial of $A$ evaluated for 
$\lambda = r_j$. Hence, by (4), 

$$|A - r_j I| = \prod_{i=1}^{n} (\lambda_i - r_j)$$

and, therefore, substituting into (6), we have 

$$|f(A)| = c^n \prod_{j=1}^{k} \prod_{i=1}^{n} (\lambda_i - r_j)$$

Next, interchanging the order in which the products are formed by first grouping 
together all the factors corresponding to a given value of $i$ and assigning a single 
factor $c$ to each such group, we have 

$$|f(A)| = c^n \prod_{i=1}^{n} \prod_{j=1}^{k} (\lambda_i - r_j) = \prod_{i=1}^{n} \left[ c \prod_{j=1}^{k} (\lambda_i - r_j) \right]$$

Finally we observe that, with the coefficient $c$, the inner product in the last expression 
is precisely the evaluation of the factored form (5) of the given polynomial 
for $t = \lambda_i$. Hence, 

$$|f(A)| = \prod_{i=1}^{n} f(\lambda_i) \qquad \text{as asserted.}$$


THEOREM 4 

If $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the characteristic values of a square matrix $A$, if $f = g/h$ 
is a rational fractional function, and if $|h(A)|$ is different from zero, then 

$$|f(A)| = f(\lambda_1) f(\lambda_2) \cdots f(\lambda_n)$$

PROOF Since, by definition, $f(A) = g(A)/h(A) = g(A)h^{-1}(A)$ and since the 
determinant of a product of square matrices is equal to the product of the determinants 
of the matric factors, we have 

$$|f(A)| = |g(A) h^{-1}(A)| = |g(A)| \, |h^{-1}(A)|$$

Moreover, as we observed in Sec. 10.3, $|h^{-1}(A)| = 1/|h(A)|$. Therefore, 

$$|f(A)| = \frac{|g(A)|}{|h(A)|}$$

However, by Theorem 3, since $g$ and $h$ are polynomials, 

$$|g(A)| = g(\lambda_1) g(\lambda_2) \cdots g(\lambda_n) \qquad \text{and} \qquad |h(A)| = h(\lambda_1) h(\lambda_2) \cdots h(\lambda_n)$$

Hence, 

$$|f(A)| = \frac{g(\lambda_1) g(\lambda_2) \cdots g(\lambda_n)}{h(\lambda_1) h(\lambda_2) \cdots h(\lambda_n)} = f(\lambda_1) f(\lambda_2) \cdots f(\lambda_n)$$

as asserted. 




THEOREM 5 

If $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the characteristic values of a square matrix $A$ and if 
$f = g/h$, where $g$ and $h$ are polynomials such that $|h(A)| \neq 0$, then the characteristic 
values of $f(A)$ are $f(\lambda_1), f(\lambda_2), \ldots, f(\lambda_n)$. 

PROOF Let 

$$\phi(x) = f(x) - \lambda = \frac{g(x)}{h(x)} - \lambda = \frac{g(x) - \lambda h(x)}{h(x)}$$

Clearly, $g(x) - \lambda h(x)$ is a polynomial, and, therefore, $\phi(x)$ is a rational fractional 
function of $x$. Hence, by the last theorem, 

$$|\phi(A)| = \phi(\lambda_1) \phi(\lambda_2) \cdots \phi(\lambda_n)$$

In other words, for all values of $\lambda$, 

$$|f(A) - \lambda I| = [f(\lambda_1) - \lambda][f(\lambda_2) - \lambda] \cdots [f(\lambda_n) - \lambda]$$

The right-hand side of this identity is, thus, the factored form of the characteristic 
polynomial $|f(A) - \lambda I|$ of the matrix $f(A)$; hence, the roots of the characteristic 
equation of $f(A)$ are 

$$\lambda = f(\lambda_1), f(\lambda_2), \ldots, f(\lambda_n)$$

as asserted. 


COROLLARY 1 

If the characteristic values of a matrix $A$ are $\lambda_1, \lambda_2, \ldots, \lambda_n$, then for all integral 
values of $k$ if $A$ is nonsingular and for all nonnegative integral values of $k$ if $A$ is 
singular, the characteristic values of $A^k$ are $\lambda_1^k, \lambda_2^k, \ldots, \lambda_n^k$. 

COROLLARY 2 

If $X_i$ is a characteristic vector corresponding to the characteristic value $\lambda_i$ of a 
square matrix $A$ and if $p$ is a polynomial, then $X_i$ is also a characteristic vector 
corresponding to the characteristic value $p(\lambda_i)$ of the matrix $p(A)$. 

EXAMPLE 2 

As an illustration of Theorem 5, consider the matrix A = jj ^ ^ Jj and the function <{>(x) = 

x/(x + 3). The characteristic equation of A is ^ 

U ~ ATI - | 1 “ X _ ~ 2 _ x | - X s + 3X + 2 « 0 

Hence, the characteristic roots are $\lambda = -1, -2$. Therefore, according to Theorem 5, the characteristic roots of $\phi(A)$ are

$$\phi(-1) = -\tfrac{1}{2} \quad\text{and}\quad \phi(-2) = -2$$

To confirm this, we have, by direct calculation,

«»-Thr,- AiA+sr >-'- ||1 



SEC. 11.4 


FUNCTIONS OF A SQUARE MATRIX 


511 


The characteristic roots of $\phi(A)$ are, therefore, the roots of the equation

$$|\phi(A) - \lambda I| = \lambda^2 + \tfrac{5}{2}\lambda + 1 = 0$$

or $-\tfrac{1}{2}$ and $-2$, as before.
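The check described in this example can be sketched numerically. The snippet below is only an illustration: the matrix `A` is a hypothetical choice (an assumption, not necessarily the matrix printed in the example) whose characteristic equation is also $\lambda^2 + 3\lambda + 2 = 0$, and the helper names `matmul` and `inv2` are introduced here for convenience.

```python
from fractions import Fraction as Fr

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def inv2(M):
    """Inverse of a nonsingular 2x2 matrix via the adjoint formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical matrix (an assumption): trace -3, determinant 2,
# so its characteristic roots are -1 and -2.
A = [[Fr(0), Fr(1)], [Fr(-2), Fr(-3)]]
A_plus_3I = [[A[i][j] + (3 if i == j else 0) for j in range(2)]
             for i in range(2)]

# phi(A) = A (A + 3I)^{-1}, the matric form of phi(x) = x / (x + 3).
phi_A = matmul(A, inv2(A_plus_3I))

def phi(x):
    return x / (x + 3)

# For a 2x2 matrix the characteristic roots have sum = trace
# and product = determinant.
trace = phi_A[0][0] + phi_A[1][1]
det = phi_A[0][0] * phi_A[1][1] - phi_A[0][1] * phi_A[1][0]
predicted = [phi(Fr(-1)), phi(Fr(-2))]   # roots predicted by Theorem 5

print(trace == sum(predicted), det == predicted[0] * predicted[1])
```

Exact rational arithmetic is used so that the comparison of trace and determinant with the predicted roots is not blurred by rounding.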

If p is a polynomial and A is a square matrix, the evaluation 
of p(A) is a perfectly straightforward matter. However, when 
A is a matrix similar to a diagonal matrix, the evaluation of 
p(A) can be appreciably simplified. To establish the result upon 
which this simplification is based, it is convenient first to prove 
the following lemmas: 


LEMMA 1

If $B = S^{-1}AS$, then $B^n = S^{-1}A^nS$.

PROOF Clearly, the lemma is true for $n = 2$, since

$$B^2 = (S^{-1}AS)(S^{-1}AS) = S^{-1}A(SS^{-1})AS = S^{-1}A^2S$$

Assuming, then, that the lemma is true for $n = k$, we have

$$B^{k+1} = BB^k = (S^{-1}AS)(S^{-1}A^kS) = S^{-1}A(SS^{-1})A^kS = S^{-1}A^{k+1}S$$

which completes the induction and establishes the lemma.

If we now apply Lemma 1 to each term of any polynomial 
function of B and then use the distributive property of matric 
multiplication, we obtain the following result: 


LEMMA 2

If $B = S^{-1}AS$ and if $p$ is a polynomial, then $p(B) = S^{-1}p(A)S$; i.e.,

$$p(S^{-1}AS) = S^{-1}p(A)S$$

Furthermore, by another easy induction we can establish the 
following observation: 


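LEMMA 3

If $D$ is the diagonal matrix with diagonal elements $d_{11}, d_{22}, \ldots, d_{nn}$, then, for every positive integer $k$, $D^k$ is the diagonal matrix with diagonal elements $d_{11}^k, d_{22}^k, \ldots, d_{nn}^k$.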



Finally, by applying Lemma 3 to each term of any poly- 
nomial function of a diagonal matrix D and then using the 
definition of matric addition, we have the following result: 


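LEMMA 4

If $D$ is the diagonal matrix

$$D = \begin{Vmatrix} d_{11} & & & O \\ & d_{22} & & \\ & & \ddots & \\ O & & & d_{nn} \end{Vmatrix}$$

and $p$ is any polynomial, then

$$p(D) = \begin{Vmatrix} p(d_{11}) & & & O \\ & p(d_{22}) & & \\ & & \ddots & \\ O & & & p(d_{nn}) \end{Vmatrix}$$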


Using Lemmas 2 and 4, we can now prove the following 
useful theorem: 

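THEOREM 6

If a matrix $A$ is similar to a diagonal matrix; i.e., if

$$S^{-1}AS = D = \begin{Vmatrix} \lambda_1 & & & O \\ & \lambda_2 & & \\ & & \ddots & \\ O & & & \lambda_n \end{Vmatrix}$$

where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the characteristic values of $A$, then

$$p(A) = S \begin{Vmatrix} p(\lambda_1) & & & O \\ & p(\lambda_2) & & \\ & & \ddots & \\ O & & & p(\lambda_n) \end{Vmatrix} S^{-1}$$

PROOF By Lemma 4,

$$p(D) = \begin{Vmatrix} p(\lambda_1) & & & O \\ & p(\lambda_2) & & \\ & & \ddots & \\ O & & & p(\lambda_n) \end{Vmatrix}$$

Also, since $S^{-1}AS = D$, it follows that $A = SDS^{-1}$. Hence, using Lemma 2, we have

$$S \begin{Vmatrix} p(\lambda_1) & & & O \\ & p(\lambda_2) & & \\ & & \ddots & \\ O & & & p(\lambda_n) \end{Vmatrix} S^{-1} = Sp(D)S^{-1} = p(SDS^{-1}) = p(A)$$

as asserted.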


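EXAMPLE 3

If $p(x) = x^4 - 4x^3 + 6x^2 - x - 3$ and $A = \begin{Vmatrix} 0 & -2 \\ 1 & 3 \end{Vmatrix}$, what is $p(A)$?

By an easy calculation we find the characteristic equation of $A$ to be

$$|A - \lambda I| = \begin{vmatrix} -\lambda & -2 \\ 1 & 3 - \lambda \end{vmatrix} = \lambda^2 - 3\lambda + 2 = 0$$

Hence, the characteristic values of $A$ are $\lambda_1 = 1$ and $\lambda_2 = 2$; and, since these are distinct, it follows from Theorem 4, Sec. 11.3, that $A$ is similar to a diagonal matrix and Theorem 6 can be applied. Now, corresponding to $\lambda_1$ and $\lambda_2$ we have the characteristic vectors

$$X_1 = \begin{Vmatrix} 2 \\ -1 \end{Vmatrix} \quad\text{and}\quad X_2 = \begin{Vmatrix} 1 \\ -1 \end{Vmatrix}$$

and from these we can construct the modal matrix

$$S = \begin{Vmatrix} 2 & 1 \\ -1 & -1 \end{Vmatrix} \quad\text{and its inverse}\quad S^{-1} = \begin{Vmatrix} 1 & 1 \\ -1 & -2 \end{Vmatrix}$$

According to Theorem 4, Sec. 11.3, these are matrices such that

$$S^{-1}AS = D = \begin{Vmatrix} 1 & 0 \\ 0 & 2 \end{Vmatrix}$$

Hence these are the matrices to be used in evaluating $p(A)$ by means of Theorem 6. Now,

$$p(\lambda_1) = p(1) = -1 \quad\text{and}\quad p(\lambda_2) = p(2) = 3$$

Therefore,

$$p(A) = A^4 - 4A^3 + 6A^2 - A - 3I = S \begin{Vmatrix} p(\lambda_1) & 0 \\ 0 & p(\lambda_2) \end{Vmatrix} S^{-1} = \begin{Vmatrix} 2 & 1 \\ -1 & -1 \end{Vmatrix} \begin{Vmatrix} -1 & 0 \\ 0 & 3 \end{Vmatrix} \begin{Vmatrix} 1 & 1 \\ -1 & -2 \end{Vmatrix} = \begin{Vmatrix} -5 & -8 \\ 4 & 7 \end{Vmatrix}$$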
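The result of Example 3 can be cross-checked by evaluating $p(A)$ in two ways: through the diagonal form of Theorem 6 and by direct substitution. The sketch below is illustrative only; the helper names `matmul`, `poly_at_matrix`, and `poly_at_scalar` are assumptions introduced here, and Horner's rule is simply one convenient way to carry out the direct evaluation.

```python
def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def poly_at_matrix(coeffs, M):
    """Horner evaluation of a polynomial at a square matrix.
    coeffs are listed from the highest power down to the constant."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = matmul(result, M)
        for i in range(n):
            result[i][i] += c          # add c * I
    return result

def poly_at_scalar(coeffs, x):
    """Horner evaluation of the same polynomial at a scalar."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

coeffs = [1, -4, 6, -1, -3]            # p(x) = x^4 - 4x^3 + 6x^2 - x - 3
A = [[0, -2], [1, 3]]
S = [[2, 1], [-1, -1]]                 # modal matrix of Example 3
S_inv = [[1, 1], [-1, -2]]
lams = [1, 2]                          # characteristic values of A

p_of_D = [[poly_at_scalar(coeffs, lams[0]), 0],
          [0, poly_at_scalar(coeffs, lams[1])]]
via_theorem_6 = matmul(matmul(S, p_of_D), S_inv)
direct = poly_at_matrix(coeffs, A)

print(via_theorem_6, direct)
```

Both routes involve only integer arithmetic here, so the two results can be compared exactly.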

After polynomial functions of a square matrix have been
defined, it is natural to consider polynomial equations in a matric
variable. In particular, now that we have developed procedures
for evaluating $p(A)$, that is, solving the equation $p(A) = X$, we
shall consider the problem of solving the nontrivial equation
$p(X) = A$, where $p$ is a given polynomial, $A$ is a given square
matrix, and $X$ is a matric variable. By means of examples (see
Exercise 1) it is easy to show that there are polynomial equations
$p(X) = A$ which have no solution. In one important case, however,
the equation $p(X) = A$ can always be solved, as the following
theorem makes clear:

THEOREM 7

If $A$ is similar to a diagonal matrix and if $p$ is a scalar polynomial, the equation $p(X) = A$ is solvable for $X$.

PROOF By hypothesis, since $A$ is similar to a diagonal matrix $D$, there exists a nonsingular matrix $S$ with the property that $S^{-1}AS = D$, or $A = SDS^{-1}$, where, say,

$$D = \begin{Vmatrix} d_{11} & & & O \\ & d_{22} & & \\ & & \ddots & \\ O & & & d_{nn} \end{Vmatrix}$$

Now, let $r_i$ be one of the roots of the equation $p(x) = d_{ii}$. Then, if

$$R = \begin{Vmatrix} r_1 & & & O \\ & r_2 & & \\ & & \ddots & \\ O & & & r_n \end{Vmatrix} \quad\text{and}\quad X = SRS^{-1}$$

we have, by Lemma 2, noting that $S = (S^{-1})^{-1}$,

$$p(X) = p(SRS^{-1}) = Sp(R)S^{-1}$$

Moreover, by Lemma 4 and the fact that $p(r_i) = d_{ii}$,

$$p(R) = \begin{Vmatrix} p(r_1) & & & O \\ & p(r_2) & & \\ & & \ddots & \\ O & & & p(r_n) \end{Vmatrix} = \begin{Vmatrix} d_{11} & & & O \\ & d_{22} & & \\ & & \ddots & \\ O & & & d_{nn} \end{Vmatrix} = D$$

Therefore,

$$p(X) = Sp(R)S^{-1} = SDS^{-1} = A$$

which proves that, if $A$ is similar to a diagonal matrix, then $p(X) = A$ has the solution $X = SRS^{-1}$. If the polynomial $p$ is of degree $k$, the scalar equation $p(x) = d_{ii}$ has, in general, $k$ distinct roots. Hence, there are $k$ distinct choices for each of the $n$ diagonal elements in $R$, and, therefore, $p(X) = A$ has, in general, at least $k^n$ different solutions.

By applying the preceding theorem to the particular equa- 
tion X 2 = A, we obtain the following corollary: 


COROLLARY 1

An $(n,n)$ matrix with distinct characteristic values has at least $2^n$ or $2^{n-1}$ distinct square roots, according as it is nonsingular or singular.

PROOF Let $A$ be an $(n,n)$ matrix with $n$ distinct characteristic values $\lambda_1, \lambda_2, \ldots, \lambda_n$. It follows, then, by Theorem 4, Sec. 11.3, that there exists a nonsingular matrix $S$ such that

$$A = S \begin{Vmatrix} \lambda_1 & & & O \\ & \lambda_2 & & \\ & & \ddots & \\ O & & & \lambda_n \end{Vmatrix} S^{-1}$$

Thus, according to the last theorem, for any choice of plus and minus signs,

$$X = S \begin{Vmatrix} \pm\sqrt{\lambda_1} & & & O \\ & \pm\sqrt{\lambda_2} & & \\ & & \ddots & \\ O & & & \pm\sqrt{\lambda_n} \end{Vmatrix} S^{-1}$$

satisfies the equation $X^2 = A$. If $A$ is nonsingular, none of the $\lambda$'s is zero, and there are $2^n$ combinations of signs each leading to a different matrix $X$ satisfying the equation $X^2 = A$. On the other hand, if $A$ is singular but still has distinct characteristic values, then, by Theorem 5, Sec. 11.2, one and only one of the $\lambda$'s must be zero, and, therefore, for one of the diagonal elements there is only a single choice rather than two. Hence, in this case there may be no more than $2^{n-1}$ distinct square roots, as asserted.
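The counting argument in this proof can be illustrated by enumerating the sign choices explicitly. The following sketch assumes a particular $(2,2)$ matrix with characteristic values 1 and 9 and a modal matrix `S` chosen for it; these values and the helper `matmul` are illustrative assumptions, not part of the corollary itself.

```python
from fractions import Fraction as Fr
from itertools import product

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

# Assumed modal matrix and its inverse (illustrative values).
S = [[Fr(1), Fr(3)], [Fr(-1), Fr(5)]]
S_inv = [[Fr(5, 8), Fr(-3, 8)], [Fr(1, 8), Fr(1, 8)]]

# A = S diag(1, 9) S^{-1}; the characteristic values 1 and 9 are distinct
# and nonzero, so 2^2 = 4 square roots are expected.
D = [[Fr(1), Fr(0)], [Fr(0), Fr(9)]]
A = matmul(matmul(S, D), S_inv)
sqrt_lams = [1, 3]                     # positive square roots of 1 and 9

square_roots = []
for signs in product((1, -1), repeat=2):
    R = [[Fr(signs[0] * sqrt_lams[0]), Fr(0)],
         [Fr(0), Fr(signs[1] * sqrt_lams[1])]]
    square_roots.append(matmul(matmul(S, R), S_inv))

# Every sign pattern should give a matrix whose square is A.
all_valid = all(matmul(X, X) == A for X in square_roots)
distinct = len({tuple(map(tuple, X)) for X in square_roots})
print(all_valid, distinct)
```

Because `S` is nonsingular, different diagonal matrices `R` necessarily give different products `S R S_inv`, which is why the four results are distinct.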


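EXAMPLE 4

Solve the equation $X^2 - 4X + 4I = \begin{Vmatrix} 4 & 3 \\ 5 & 6 \end{Vmatrix}$.

The characteristic equation of the matrix $A = \begin{Vmatrix} 4 & 3 \\ 5 & 6 \end{Vmatrix}$ is

$$\begin{vmatrix} 4 - \lambda & 3 \\ 5 & 6 - \lambda \end{vmatrix} = \lambda^2 - 10\lambda + 9 = 0$$

Hence, the characteristic values of $A$ are $\lambda_1 = 1$, $\lambda_2 = 9$, and the corresponding characteristic vectors are $X_1 = \begin{Vmatrix} 1 \\ -1 \end{Vmatrix}$, $X_2 = \begin{Vmatrix} 3 \\ 5 \end{Vmatrix}$. Therefore, by Theorem 4, Sec. 11.3, $A$ is similar to a diagonal matrix; that is,

$$S^{-1}AS = D \quad\text{or}\quad A = SDS^{-1}$$

where $S$ is the modal matrix $\begin{Vmatrix} 1 & 3 \\ -1 & 5 \end{Vmatrix}$, $S^{-1} = \dfrac{1}{8}\begin{Vmatrix} 5 & -3 \\ 1 & 1 \end{Vmatrix}$, and $D = \begin{Vmatrix} 1 & 0 \\ 0 & 9 \end{Vmatrix}$. We must now solve the equations $p(x) = d_{ii}$ for $r_i$ ($i = 1, 2$):

$$x^2 - 4x + 4 = d_{11} = 1 \qquad x^2 - 4x + 4 = d_{22} = 9$$
$$x = r_1 = 1, 3 \qquad\qquad x = r_2 = -1, 5$$

Pairing each possibility for $r_1$ with each possibility for $r_2$, we thus obtain four possibilities for the matrix $R$:

$$R_1 = \begin{Vmatrix} 1 & 0 \\ 0 & -1 \end{Vmatrix} \quad R_2 = \begin{Vmatrix} 1 & 0 \\ 0 & 5 \end{Vmatrix} \quad R_3 = \begin{Vmatrix} 3 & 0 \\ 0 & -1 \end{Vmatrix} \quad R_4 = \begin{Vmatrix} 3 & 0 \\ 0 & 5 \end{Vmatrix}$$

Then, according to Theorem 7, the solutions of the given equation are

$$X_1 = SR_1S^{-1} = \begin{Vmatrix} 1 & 3 \\ -1 & 5 \end{Vmatrix} \begin{Vmatrix} 1 & 0 \\ 0 & -1 \end{Vmatrix} \cdot \frac{1}{8}\begin{Vmatrix} 5 & -3 \\ 1 & 1 \end{Vmatrix} = \frac{1}{4}\begin{Vmatrix} 1 & -3 \\ -5 & -1 \end{Vmatrix}$$

and, similarly,

$$X_2 = SR_2S^{-1} = \frac{1}{2}\begin{Vmatrix} 5 & 3 \\ 5 & 7 \end{Vmatrix} \qquad X_3 = SR_3S^{-1} = \frac{1}{2}\begin{Vmatrix} 3 & -3 \\ -5 & 1 \end{Vmatrix} \qquad X_4 = SR_4S^{-1} = \frac{1}{4}\begin{Vmatrix} 15 & 3 \\ 5 & 17 \end{Vmatrix}$$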
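As a check on Example 4, each of the four matrices can be substituted back into the left-hand side of the equation. The sketch below uses exact fractions; the helper names `matmul` and `p_of` are assumptions introduced for illustration.

```python
from fractions import Fraction as Fr

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

A = [[4, 3], [5, 6]]

# The four solutions obtained in Example 4.
solutions = [
    [[Fr(1, 4), Fr(-3, 4)], [Fr(-5, 4), Fr(-1, 4)]],   # X1
    [[Fr(5, 2), Fr(3, 2)], [Fr(5, 2), Fr(7, 2)]],      # X2
    [[Fr(3, 2), Fr(-3, 2)], [Fr(-5, 2), Fr(1, 2)]],    # X3
    [[Fr(15, 4), Fr(3, 4)], [Fr(5, 4), Fr(17, 4)]],    # X4
]

def p_of(X):
    """Evaluate X^2 - 4X + 4I for a (2,2) matrix X."""
    X2 = matmul(X, X)
    return [[X2[i][j] - 4 * X[i][j] + (4 if i == j else 0)
             for j in range(2)] for i in range(2)]

checks = [p_of(X) == A for X in solutions]
print(checks)
```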


EXERCISES 

1 Prove that there is no matrix which satisfies the equation X s = 

2 Show that, for particular polynomials and particular matrices, each of the following cases is 
possible: 




a A nonsingular, p(A) nonsingular b A nonsingular, p(A) singular 

c A singular, p(A) nonsingular d A singular, p(A) singular 

3 Prove Theorem 2. [Hint: Note first that it is sufficient to prove that


fl(x) Mx) 
Mx) + Mx) 

Mx) 

implies 

MA) MA) 

MA) 

Mx) 

MA) MA) 

MA) 

Mx) Mx) 

Mx) 

• r 

MA) MA) 

MA) 

Mx) f*(x) 

Mx) 

imp les 

MA) ' MA) 

MA) 

Mx)/Mx) 

Mx)/Mx) 

Mx) 

Mx) 

implies 

MA)/MA) 

MAVMA) 

MA) 

MA) 


Then clear of fractions in the scalar identities, use Theorem 1, and multiply the resulting 
matric identities by the appropriate inverses.] 

4 If $A$ is a diagonal matrix and $p$ is any scalar polynomial, show that $p(A)$ is also a diagonal
matrix.

5 If $A$ is a diagonal matrix and $f$ is a rational fractional function, is $f(A)$ necessarily a diagonal
matrix?

6 Show that $I_2$ has infinitely many distinct square roots.

7 Prove that an (n,n) matrix with distinct characteristic values has no square roots other than 
those identified by Corollary 1, Theorem 7. (Hint: Use the result of Exercise 6, Sec. 10.2.) 

8 Prove Corollary 2, Theorem 5. [Hint: First prove the assertion for the special polynomials
$p(A) = A^k$ by premultiplying the equation $AX_i = \lambda_iX_i$ by $A, A^2, \ldots, A^{k-1}$, in turn.]




9 By actually constructing an infinite family of solutions, show that each of the following 
matric equations is satisfied by infinitely many matrices: 


a X s - 2X 
c X s - 4X ~ 51, = 0 


b X* - 4X + 3/ s - 0 
d X a - 6X S + HZ -6/3 = 0 


10 Without attempting to find the solutions, show that, for all values of $a$ and $b$, the equation
$X^2 + aX + bI_2 = 0$ is satisfied by infinitely many matrices. What do you think is the
generalization of this result to equations in an $(n,n)$ matric variable?

11 Show that the following matric equations have no solutions: 

a X s — 2X — 3/j = II “J M| b X 2 - 4X + 3/ 4 = || || 


c X* - 4X - 51, 


d X s - 4X + 3/, 


12 If A and B commute, show that A commutes with any polynomial in B. 

13 Verify each of the following identities for X <= II 1 \ II and X == I] ^ \ 


a (X - iy « X s - 2X + 1 


X* - / X-/'X + / 


b X* - I - (X - /)(X 2 + X + /) 


14 If $A^k = 0$ for some positive integer $k$, prove that every characteristic value of $A$ is zero.

15 Solve each of the following matric equations:


a X* - 5X + 3/ = 


b X s + 6X + 9/ = 


16 If $f(x) = x/(x + 4)$, compute $f(A)$ for each of the following matrices $A$:


d I -3 1 0 

1 -4 0 

0 0-5 


- 3z 5 4* 4x + 2, evaluate -p(A) for each of the following matrices A: 


2 1 1 

14 3 

-1 -1 0 


18 Verify that the characteristic values of $f(A)$ are equal to $f(\lambda_i)$ for each of the following functions and each of the given matrices:

a x* - 2x + 3 bz 1 — 4x + 3 c z a + z s + 2 + 1 


SEC. 11.5 


THE CAYLEY-HAMILTON THEOREM 


517 


19 Verify that the characteristic values of $f(A)$ are equal to $f(\lambda_i)$ for each of the following functions and each of the given matrices:


X 

b ^ 

x + 1 


* + 2 

x i + x 4- 1 

II- 1 2 II 

a || 4 2 1| 

iii || 5 2 1| 

II -1 2 || 

||1 3 II 

II 2 2 1 


20 If $p$ is a polynomial, is it possible for $p(A)$ to have a characteristic vector which is not a
characteristic vector of $A$? Justify your answer.


11.5

The Cayley-Hamilton theorem

Since a square null matrix, being a diagonal matrix, is obviously
similar to a diagonal matrix, it follows from Theorem 7, Sec. 11.4,
that the equation $p(X) = 0$ is always solvable. On the other
hand, it is not immediately evident that, given a square matrix $A$,
there is always a polynomial equation with scalar coefficients
$p(X) = 0$ of which $A$ is a solution. This is the case, however,
and it is not difficult to show (see Exercise 1) that any square
matrix of order $n$ satisfies a polynomial equation whose order
is at most $n^2$. In fact, for any square matrix $A$, there is always a
polynomial equation of order $n$ which is satisfied by $A$.

To prove this, it is convenient to prove first a preliminary
result concerning polynomials whose coefficients are not scalars
but square matrices. Before we can do this, however, it is necessary
that we define what is meant by the value of such a polynomial, say

$$F(\lambda) = C_0 + C_1\lambda + \cdots + C_k\lambda^k$$

when a square matrix $A$ is substituted for the scalar variable $\lambda$.
Since matric multiplication is not commutative, it is clear that,
in general, the various powers of $A$ will not commute with the
coefficient matrices in $F(\lambda)$. Hence, although it is true that

$$C_0 + C_1\lambda + \cdots + C_k\lambda^k = C_0 + \lambda C_1 + \cdots + \lambda^k C_k$$

the corresponding matric relation, namely,

$$C_0 + C_1A + \cdots + C_kA^k = C_0 + AC_1 + \cdots + A^kC_k$$

is in general false. Thus it is necessary for us to assign a meaning
to $F(A)$, and this we do by agreeing that

$$F(A) = C_0 + C_1A + \cdots + C_kA^k$$

Now we have already seen (Theorem 1, Sec. 11.4) that
identities relating scalar polynomials imply corresponding identities
when the scalar variable is replaced by a square matrix.




This is not true, however, for identical relations involving polynomials
with matric coefficients. For instance, if

$$F(\lambda) = C_0 + C_1\lambda \quad\text{and}\quad G(\lambda) = D_0 + D_1\lambda$$

then, for the product $F(\lambda)G(\lambda)$, we have

$$P(\lambda) = C_0D_0 + (C_0D_1 + C_1D_0)\lambda + C_1D_1\lambda^2$$

On the other hand, if we replace the scalar $\lambda$ by a square matrix $A$,
we have

$$F(A) = C_0 + C_1A \qquad G(A) = D_0 + D_1A$$

and

$$F(A)G(A) = C_0D_0 + C_0D_1A + C_1AD_0 + C_1AD_1A$$

which is not equal to $P(A)$ unless

$$AD_0 = D_0A \quad\text{and}\quad AD_1 = D_1A$$

However, we can prove the following theorem, which is the necessary
preliminary result we mentioned above:

THEOREM 1

If $F(\lambda)$ and $P(\lambda)$ are polynomials in the scalar variable $\lambda$ with coefficients which are square matrices and if $P(\lambda) = F(\lambda)(A - \lambda I)$, then $P(A) = 0$.

PROOF In view of the fact that we have just seen that $P(\lambda) = F(\lambda)G(\lambda)$ does not imply that $P(A) = F(A)G(A)$, we cannot prove this theorem simply by substituting $A$ for $\lambda$ in the assertion of the theorem. Instead we must first multiply out the right-hand side of the given relation, express it as a polynomial in $\lambda$, and then replace $\lambda$ by $A$. To do this, let us suppose that

$$F(\lambda) = C_0 + C_1\lambda + C_2\lambda^2 + \cdots + C_k\lambda^k$$

where $C_0, C_1, C_2, \ldots, C_k$ are $(n,n)$ matrices. Then

$$\begin{aligned} P(\lambda) &= (C_0 + C_1\lambda + C_2\lambda^2 + \cdots + C_k\lambda^k)(A - \lambda I) \\ &= C_0A + C_1A\lambda + C_2A\lambda^2 + \cdots + C_kA\lambda^k - C_0\lambda - C_1\lambda^2 - \cdots - C_{k-1}\lambda^k - C_k\lambda^{k+1} \\ &= C_0A + (C_1A - C_0)\lambda + (C_2A - C_1)\lambda^2 + \cdots + (C_kA - C_{k-1})\lambda^k - C_k\lambda^{k+1} \end{aligned}$$

Now, substituting $A$ for $\lambda$, we have

$$P(A) = C_0A + (C_1A - C_0)A + (C_2A - C_1)A^2 + \cdots + (C_kA - C_{k-1})A^k - C_kA^{k+1} = 0$$

since the terms cancel in pairs, as asserted.
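The telescoping cancellation in this proof can be watched on a small numerical case. In the sketch below the matrix `A` and the coefficient matrices in `C` are arbitrary illustrative choices (assumptions), and the helper names are introduced here; the substitution follows the convention adopted in this section, with coefficients on the left and powers of $A$ on the right.

```python
def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def sub(P, Q):
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def P_at_A(C, A):
    """Value at A of P(lam) = F(lam)(A - lam*I), where F has the matric
    coefficients C[0], ..., C[k]."""
    n, k = len(A), len(C) - 1
    zero = [[0] * n for _ in range(n)]
    # Coefficients of P: C0*A, then Cj*A - C(j-1), and finally -Ck.
    coeffs = []
    for j in range(k + 2):
        left = matmul(C[j], A) if j <= k else zero
        prev = C[j - 1] if j >= 1 else zero
        coeffs.append(sub(left, prev))
    total = zero
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for Pj in coeffs:
        total = add(total, matmul(Pj, power))   # coefficient times A^j
        power = matmul(power, A)
    return total

# Arbitrary illustrative data (assumptions, not from the text).
A = [[1, 2], [3, 4]]
C = [[[0, 1], [1, 0]], [[2, 0], [1, 1]], [[1, 1], [0, 3]]]
result = P_at_A(C, A)
print(result)
```

The coefficient matrices in `C` do not commute with `A`, which is exactly the situation in which a naive substitution into the factored form would not be justified.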

We are now in a position to prove one of the most important 
results in the theory of matrices, the famous Cayley-Hamilton 
theorem : 




THEOREM 2

Every square matrix satisfies its own characteristic equation.

PROOF Let $A$ be an $(n,n)$ matrix whose characteristic equation is

$$|A - \lambda I| = \begin{vmatrix} a_{11} - \lambda & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} - \lambda & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} - \lambda \end{vmatrix} = (-1)^n[\lambda^n - \beta_1\lambda^{n-1} + \cdots + (-1)^n\beta_n] = 0$$

The adjoint of the matrix $A - \lambda I$ is clearly an $(n,n)$ matrix whose elements, being the cofactors of the elements of the determinant $|A - \lambda I|$, are polynomials in $\lambda$; that is,

$$\operatorname{adj}(A - \lambda I) = \begin{Vmatrix} p_{11}(\lambda) & p_{12}(\lambda) & \cdots & p_{1n}(\lambda) \\ p_{21}(\lambda) & p_{22}(\lambda) & \cdots & p_{2n}(\lambda) \\ \vdots & \vdots & & \vdots \\ p_{n1}(\lambda) & p_{n2}(\lambda) & \cdots & p_{nn}(\lambda) \end{Vmatrix}$$

Furthermore, from the definition of matric addition, it follows that the last matrix can be written as a polynomial in $\lambda$, say $F(\lambda)$, whose coefficients are $(n,n)$ matrices, the element in the $i$th row and $j$th column of the matric coefficient of $\lambda^k$ being the coefficient of $\lambda^k$ in $p_{ij}(\lambda)$. Now, from Corollary 1, Theorem 1, Sec. 10.3, we have

$$\operatorname{adj}(A - \lambda I) \cdot (A - \lambda I) = |A - \lambda I|\,I = (-1)^n[\lambda^nI - \beta_1\lambda^{n-1}I + \cdots + (-1)^n\beta_nI]$$

that is,

$$(-1)^n[\lambda^nI - \beta_1\lambda^{n-1}I + \cdots + (-1)^n\beta_nI] = F(\lambda)(A - \lambda I)$$

But this is a relation between polynomials in $\lambda$ with matric coefficients of precisely the type covered by Theorem 1. Hence, the left-hand side must vanish when $\lambda = A$. In other words,

$$A^n - \beta_1A^{n-1} + \cdots + (-1)^n\beta_nI = 0$$

that is, the matrix $A$ satisfies its own characteristic equation, as asserted.


Using the Cayley-Hamilton theorem, the $n$th power of any
square matrix $A$ can be expressed as a linear combination of
lower powers of $A$. Hence, by repeated applications of the Cayley-Hamilton
theorem, any positive integral power of $A$ and, therefore,
any polynomial in $A$ can be expressed as a polynomial in $A$
of degree at most $n - 1$. Moreover, if $A$ is nonsingular, then $A^{-1}$
exists, and, in the expansion of $|A - \lambda I|$, the constant term $\beta_n = |A|$
is different from zero. Hence, we can multiply the Cayley-Hamilton equation

$$A^n - \beta_1A^{n-1} + \cdots + (-1)^n\beta_nI = 0$$

by $A^{-1}$, getting

$$A^{n-1} - \beta_1A^{n-2} + \cdots + (-1)^{n-1}\beta_{n-1}I + (-1)^n\beta_nA^{-1} = 0$$

whence, solving for $A^{-1}$, we find

$$A^{-1} = \frac{(-1)^{n-1}}{\beta_n}[A^{n-1} - \beta_1A^{n-2} + \cdots + (-1)^{n-1}\beta_{n-1}I]$$

In some cases this is a convenient method of obtaining the inverse
of a matrix $A$.

EXAMPLE 1

If $A = \begin{Vmatrix} -4 & 5 & 5 \\ -5 & 6 & 5 \\ -5 & 5 & 6 \end{Vmatrix}$ we find by an easy calculation that

$$|A - \lambda I| = -\lambda^3 + 8\lambda^2 - 13\lambda + 6$$

Hence, by the Cayley-Hamilton theorem it follows that

$$A^3 - 8A^2 + 13A - 6I = 0$$

as can easily be verified by direct calculation. Using this relation, we can now express higher powers of $A$ as quadratic polynomials in $A$. For instance,

$$\begin{aligned} A^4 = A \cdot A^3 &= A(8A^2 - 13A + 6I) \\ &= 8A^3 - 13A^2 + 6A \\ &= 8(8A^2 - 13A + 6I) - 13A^2 + 6A \\ &= 51A^2 - 98A + 48I \end{aligned}$$

and

$$\begin{aligned} A^5 = A \cdot A^4 &= A(51A^2 - 98A + 48I) \\ &= 51(8A^2 - 13A + 6I) - 98A^2 + 48A \\ &= 310A^2 - 615A + 306I \end{aligned}$$

Similarly, multiplying the Cayley-Hamilton equation through by $A^{-1}$ and then solving for $A^{-1}$, we find

$$A^{-1} = \frac{1}{6}(A^2 - 8A + 13I) = \frac{1}{6}\left( \begin{Vmatrix} -34 & 35 & 35 \\ -35 & 36 & 35 \\ -35 & 35 & 36 \end{Vmatrix} - 8\begin{Vmatrix} -4 & 5 & 5 \\ -5 & 6 & 5 \\ -5 & 5 & 6 \end{Vmatrix} + 13\begin{Vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{Vmatrix} \right) = \frac{1}{6}\begin{Vmatrix} 11 & -5 & -5 \\ 5 & 1 & -5 \\ 5 & -5 & 1 \end{Vmatrix}$$
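The computations of Example 1 lend themselves to a quick machine check. The following sketch verifies the Cayley-Hamilton relation for this matrix, the reduction of $A^4$ to a quadratic in $A$, and the formula for $A^{-1}$; the helper names `matmul` and `lincomb` are assumptions introduced for illustration.

```python
from fractions import Fraction as Fr

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def lincomb(*terms):
    """Linear combination of square matrices; terms are (scalar, matrix)."""
    n = len(terms[0][1])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(n)]
            for i in range(n)]

A = [[-4, 5, 5], [-5, 6, 5], [-5, 5, 6]]
I = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
A2 = matmul(A, A)
A3 = matmul(A2, A)
A4 = matmul(A3, A)

# Cayley-Hamilton relation: A^3 - 8A^2 + 13A - 6I should be the null matrix.
cayley_hamilton = lincomb((1, A3), (-8, A2), (13, A), (-6, I))

# Reduction of A^4 to a quadratic polynomial in A.
A4_reduced = lincomb((51, A2), (-98, A), (48, I))

# Inverse from the Cayley-Hamilton equation: A^{-1} = (A^2 - 8A + 13I)/6.
A_inv = lincomb((Fr(1, 6), A2), (Fr(-8, 6), A), (Fr(13, 6), I))

print(cayley_hamilton, A4 == A4_reduced, matmul(A, A_inv) == I)
```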

The Cayley-Hamilton equation is not necessarily the polynomial
equation of lowest degree satisfied by a given square
matrix. For instance, it is easily verified that the matrix $A$ in the
last example satisfies not only the Cayley-Hamilton equation

$$A^3 - 8A^2 + 13A - 6I = 0$$

but also the simpler, quadratic equation

$$A^2 - 7A + 6I = 0$$

DEFINITION 1

If $A$ is a square matrix, any polynomial $p$ with the property that $p(A) = 0$ is
said to annihilate $A$.




Let us now consider the set of polynomials of minimum
degree which annihilate a given square matrix $A$, and, for
definiteness, let us assume that by multiplying them by suitable
constants their leading coefficients have been made equal to 1.
Clearly, all these polynomials are identical. In fact, if this is not
the case and if there are two such polynomials $f$ and $g$, then
$h = f - g$ is a polynomial whose degree is lower than the degree
of $f$ and $g$ such that

$$h(A) = f(A) - g(A) = 0 - 0 = 0$$

But, by hypothesis, $f$ and $g$ are polynomial annihilators of $A$ of
minimum degree. Hence we have a contradiction unless $h$ is
identically zero, that is, unless $f$ and $g$ are the same.

We are thus justified in introducing the following definition: 

DEFINITION 2 

The unique polynomial with leading coefficient 1 and of minimum degree which 
annihilates a square matrix A is called the minimum polynomial of A. 

Among the properties of minimum polynomials, the following
are worthy of mention here:

THEOREM 3 

Similar matrices have the same minimum polynomial. 

PROOF Let $A$ and $B$ be similar matrices, so that $B = S^{-1}AS$. Then by Lemma 2, Sec. 11.4, for any polynomial $p$,

$$p(B) = S^{-1}p(A)S$$

From this we conclude that any polynomial which annihilates $A$ also annihilates
$B$, and conversely. Hence, the minimum polynomials of $A$ and $B$ must be the
same, as asserted.

THEOREM 4 

The minimum polynomial of any square matrix A is a divisor of any polynomial 
which annihilates A. 

PROOF Let the minimum polynomial of a matrix $A$ be $f(x)$, and let $\phi(x)$ be any polynomial with the property that $\phi(A) = 0$. Then, by the division algorithm of elementary algebra,

$$\phi(x) = q(x)f(x) + r(x)$$

where the remainder polynomial $r(x)$ is either identically zero or of lower degree than the divisor polynomial $f(x)$. Then, by Theorem 1, Sec. 11.4,

$$\phi(A) = q(A)f(A) + r(A)$$

However, by hypothesis, $\phi(A) = 0$ and $f(A) = 0$. Hence, $r(A) = 0$, and, therefore, $r(x)$ is identically zero; for, if it were not, then it would be a polynomial which annihilated $A$ and whose degree was less than the degree of the minimum polynomial of $A$, namely, $f(x)$. But if $r(x) = 0$, then the minimum polynomial is a factor of $\phi(x)$, as asserted.




THEOREM 5 

If the characteristic roots of a matrix A are all distinct, then the characteristic 
polynomial and the minimum polynomial of A are the same, except possibly for 
sign. 

PROOF Let $A$ be a matrix with distinct characteristic roots $\lambda_1, \lambda_2, \ldots, \lambda_n$, and let the characteristic polynomial of $A$ be

$$f(\lambda) = (\lambda_1 - \lambda)(\lambda_2 - \lambda)\cdots(\lambda_n - \lambda)$$

Then, since the minimum polynomial of $A$, say $g(\lambda)$, must be a factor of $f(\lambda)$, it follows that, if $f(\lambda)$ and $g(\lambda)$ differ in more than sign, then $g(\lambda)$ must be the product of some but not all of the factors of $f(\lambda)$. Specifically, suppose that $g(\lambda)$ does not contain the factor $(\lambda_i - \lambda)$. Now, by Theorem 5, Sec. 11.4, the characteristic roots of the matrix $g(A)$ are $g(\lambda_1), g(\lambda_2), \ldots, g(\lambda_n)$. However, since $g(\lambda)$ does not contain the factor $(\lambda_i - \lambda)$, it follows that $g(\lambda_i) \neq 0$. Hence, $g(A)$ has at least one nonzero characteristic root. But, if this is the case, then $g(A)$ is not a null matrix; that is, $g(A) \neq 0$, contrary to the hypothesis that $g$ is the minimum polynomial of $A$. This contradiction shows that $f(\lambda)$ and $g(\lambda)$ cannot differ except possibly in sign, and the theorem is established.

As an interesting application of the theory of the minimum 
polynomial of a matrix, we have the following result : 


THEOREM 6 

If $A$ is a square matrix and if $f(x)$ and $g(x)$ are scalar polynomials such that $g(A)$ is nonsingular, then $f(A)/g(A)$ is equal to a polynomial in $A$.

PROOF Since, by definition, $f(A)/g(A) = f(A)g^{-1}(A)$, it is clearly sufficient to prove that $g^{-1}(A)$ is a polynomial in $A$. To do this, let

$$\phi(x) = x^k + c_1x^{k-1} + \cdots + c_{k-1}x + c_k$$

be the minimum polynomial of the matrix $G = g(A)$. Then

$$\phi(G) = G^k + c_1G^{k-1} + \cdots + c_{k-1}G + c_kI = 0$$

and from this, by multiplying through by $G^{-1}$ and transposing, we obtain

$$c_kG^{-1} = -(G^{k-1} + c_1G^{k-2} + \cdots + c_{k-1}I)$$

Now $c_k \neq 0$, for otherwise the right-hand side of the last equation is a polynomial which annihilates $G$ and whose degree is less than the degree $k$ of the minimum polynomial of $G$. Hence, we can divide by $c_k$ and obtain $G^{-1}$ as a polynomial in $G$. Finally, substituting $g(A)$ for $G$ in the expression for $G^{-1}$, we obtain $G^{-1} = g^{-1}(A)$ as a polynomial in $A$, as required. It is important to note that the structure of the polynomial in $A$ to which $g^{-1}(A)$ is equal depends upon $A$ as well as upon $g$. Hence, if $f(A)/g(A) = h(A)$, we cannot conclude that for another matrix $B$ we necessarily have $f(B)/g(B) = h(B)$.

As we have seen, by successive applications of the Cayley-Hamilton
theorem it is possible to reduce any polynomial in an
$(n,n)$ matrix $A$ to another polynomial in $A$ whose degree is at most
$n - 1$. The use of the Cayley-Hamilton theorem is not always
the most convenient way to accomplish this reduction, however,
and, when the characteristic values of $A$ are all distinct, it is
sometimes easier to proceed as follows:

Knowing that, for any polynomial $p$ and any $(n,n)$ matrix $A$,
the matrix $p(A)$ can be expressed as a polynomial, say $\phi(A)$, of
degree at most $n - 1$, let us write

$$\begin{aligned} \phi(\lambda) = {} & c_1[(\lambda - \lambda_2)(\lambda - \lambda_3)\cdots(\lambda - \lambda_n)] \\ & + c_2[(\lambda - \lambda_1)(\lambda - \lambda_3)\cdots(\lambda - \lambda_n)] + \cdots \\ & + c_n[(\lambda - \lambda_1)(\lambda - \lambda_2)\cdots(\lambda - \lambda_{n-1})] \end{aligned}$$

where the $i$th term on the right is the product of all the factors of
the characteristic polynomial of $A$ except $\lambda - \lambda_i$. Clearly, if
$c_1, c_2, \ldots, c_n$ are arbitrary and the $\lambda$'s are all distinct, the
right-hand side is an arbitrary polynomial of degree $n - 1$. Then,

$$\begin{aligned} p(A) = {} & c_1[(A - \lambda_2I)(A - \lambda_3I)\cdots(A - \lambda_nI)] \\ & + c_2[(A - \lambda_1I)(A - \lambda_3I)\cdots(A - \lambda_nI)] + \cdots \\ & + c_n[(A - \lambda_1I)(A - \lambda_2I)\cdots(A - \lambda_{n-1}I)] \end{aligned} \tag{1}$$

Now, if $X_k$ is a characteristic vector of $A$ corresponding to the
characteristic value $\lambda_k$, it follows that

$$(A - \lambda_kI)X_k = 0 \tag{2}$$

Moreover,

$$\begin{aligned} (A - \lambda_jI)X_k &= [A - \lambda_kI + (\lambda_k - \lambda_j)I]X_k \\ &= (A - \lambda_kI)X_k + (\lambda_k - \lambda_j)X_k \\ &= (\lambda_k - \lambda_j)X_k \end{aligned} \tag{3}$$

Hence, if we postmultiply Eq. (1) by $X_k$ and simplify the products
by successively applying Eqs. (2) and (3), we find that every
product vanishes except the $k$th, and we have

$$p(A)X_k = c_k[(\lambda_k - \lambda_1)\cdots(\lambda_k - \lambda_{k-1})(\lambda_k - \lambda_{k+1})\cdots(\lambda_k - \lambda_n)]X_k \tag{4}$$

Furthermore, according to Corollary 2, Theorem 5, Sec. 11.4, $X_k$
is a characteristic vector of the matrix $p(A)$ corresponding to the
characteristic value $p(\lambda_k)$. Therefore,

$$[p(A) - p(\lambda_k)I]X_k = 0 \quad\text{or}\quad p(A)X_k = p(\lambda_k)X_k$$

Thus, Eq. (4) becomes

$$p(\lambda_k)X_k = c_k[(\lambda_k - \lambda_1)\cdots(\lambda_k - \lambda_{k-1})(\lambda_k - \lambda_{k+1})\cdots(\lambda_k - \lambda_n)]X_k$$

which implies that

$$p(\lambda_k) = c_k\prod_{j \neq k}(\lambda_k - \lambda_j) \quad\text{or}\quad c_k = \frac{p(\lambda_k)}{\prod_{j \neq k}(\lambda_k - \lambda_j)}$$

Therefore, substituting these values for the $c$'s into Eq. (1), we
obtain the identity

$$p(A) = \sum_{i=1}^{n} p(\lambda_i)\,\frac{\prod_{j \neq i}(A - \lambda_jI)}{\prod_{j \neq i}(\lambda_i - \lambda_j)} \tag{5}$$

This important result is known as Sylvester’s identity.* It may
be extended to cover the case in which the characteristic values
of $A$ are not all distinct, but we shall not undertake this extension.†


EXAMPLE 2

If $A = \begin{Vmatrix} -15 & -14 & -40 \\ 6 & 7 & 14 \\ 5 & 4 & 14 \end{Vmatrix}$, express $p(A) = A^6 - 6A^5 + 12A^4 - 12A^3 + 12A^2 - 8A + 3I$ in as simple a form as possible.

By a straightforward calculation we find, for the characteristic equation of $A$,

$$|A - \lambda I| = \begin{vmatrix} -15 - \lambda & -14 & -40 \\ 6 & 7 - \lambda & 14 \\ 5 & 4 & 14 - \lambda \end{vmatrix} = -\lambda^3 + 6\lambda^2 - 11\lambda + 6 = 0$$

Hence, the characteristic values of $A$ are $\lambda_1 = 1$, $\lambda_2 = 2$, $\lambda_3 = 3$. Therefore,

$$p(\lambda_1) = p(1) = 2 \qquad p(\lambda_2) = p(2) = 3 \qquad p(\lambda_3) = p(3) = 6$$

and, substituting into Eq. (5),

$$p(A) = \frac{2}{(1 - 2)(1 - 3)}(A - 2I)(A - 3I) + \frac{3}{(2 - 1)(2 - 3)}(A - I)(A - 3I) + \frac{6}{(3 - 1)(3 - 2)}(A - I)(A - 2I) = A^2 - 2A + 3I$$
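Sylvester's identity, Eq. (5), translates almost line for line into a short routine. The sketch below is one possible rendering under the assumption that the characteristic values are distinct and already known; the function names `sylvester`, `p_scalar`, and `p_matrix` are introduced here for illustration. It applies the identity to the matrix of Example 2 and compares the result both with $A^2 - 2A + 3I$ and with a direct evaluation of $p(A)$.

```python
from fractions import Fraction as Fr

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def shifted(A, lam):
    """Return A - lam * I."""
    return [[A[i][j] - (lam if i == j else 0) for j in range(len(A))]
            for i in range(len(A))]

def sylvester(p, A, lams):
    """Eq. (5): sum of p(lam_i) * prod_{j != i} (A - lam_j I)/(lam_i - lam_j).
    Assumes the characteristic values in lams are all distinct."""
    n = len(A)
    total = [[Fr(0)] * n for _ in range(n)]
    for i, li in enumerate(lams):
        term = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
        denom = 1
        for j, lj in enumerate(lams):
            if j != i:
                term = matmul(term, shifted(A, lj))
                denom *= li - lj
        scale = Fr(p(li), denom)
        total = [[total[r][c] + scale * term[r][c] for c in range(n)]
                 for r in range(n)]
    return total

coeffs = [1, -6, 12, -12, 12, -8, 3]      # p, highest power first

def p_scalar(x):
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

def p_matrix(M):
    """Direct Horner evaluation of p at a square matrix."""
    result = [[0] * len(M) for _ in M]
    for c in coeffs:
        result = shifted(matmul(result, M), -c)   # result * M + c * I
    return result

A = [[-15, -14, -40], [6, 7, 14], [5, 4, 14]]
by_sylvester = sylvester(p_scalar, A, [1, 2, 3])
quadratic = shifted(shifted(matmul(A, A), 0), -3)          # A^2 + 3I
quadratic = [[quadratic[r][c] - 2 * A[r][c] for c in range(3)]
             for r in range(3)]                            # A^2 - 2A + 3I
print(by_sylvester == quadratic, by_sylvester == p_matrix(A))
```

The routine takes the characteristic values as input rather than computing them, which keeps the sketch close to the way the identity is used in the example.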


EXERCISES

1 Without using the Cayley-Hamilton theorem, prove that every $(n,n)$ matrix satisfies a
polynomial equation of degree at most $n^2$.

2 Using the Cayley-Hamilton theorem, find the inverse of each of the following matrices:


a 

2 -4 -4 

b 

2 1 1 

c 

-4 —9 -3 


1 -4 -5 


1 4 3 


1 4 1 


-1 4 ’ 5 


-1 -10 


3 3 2 


3 Find the minimum polynomial of each of the following matrices:


1 1 1 
1 1 1 


b 



2 

-2 

3 


c 111 
1 1 -1 

-1 1 3 



2 

-1 

2 


-2 

2 

-1 


* Named for the English algebraist J. J. Sylvester (1814-1897).
† See, for instance, W. J. Duncan, R. A. Fraser, and A. R. Collar, “Elementary Matrices,” pp. 78-79, Cambridge University Press, New York,



sec;. n,6 


INFINITE SERIES OF MATRICES 


525 


4 Using both the Cayley-Hamilton theorem and Sylvester’s identity, evaluate 


p{A) « A & - A i - 2 A 3 + A 2 + A - 3 1 
for each of the following matrices A : 


■ l :!ll 

d 2 -4 -4 II 

i ill 


b I: ill 

e I -1 1 1 

2-2-3 
1-2 2 3 


c -2 
2 


6 


If f(x) - x/(x + 4), express f(A) 

‘ Jill 

d -4 -9 -3 e 

1 4 1 

3 3 2 


as a polynomial in A, for each of the following matrices .4 : 



11.6


Infinite series of matrices


In the last two sections we have considered polynomial and
rational fractional functions of a square matrix. In this section
we shall conclude our survey of the theory of matrices by investigating
briefly, and for the most part without proof, infinite
series of matrices. Once we have suitable criteria for the convergence
of infinite series of matrices, it will then be possible to
define and study transcendental functions of square matrices,
such as $e^A$, $\sin A$, and $\cos A$, by allowing the corresponding scalar
series to have matric arguments. As our experience with scalar
series suggests, we must begin with the concept of the convergence
of a sequence of matrices:


DEFINITION 1

If $A_1, A_2, \ldots, A_n, \ldots$ is a sequence of $(p,q)$ matrices, if $(a_{ij})_n$ is the element in
the $i$th row and $j$th column of $A_n$, and if $a_{ij}$ is the corresponding element in a
$(p,q)$ matrix $A$, the sequence $A_1, A_2, \ldots, A_n, \ldots$ is said to converge to the
matrix $A$ if and only if, for all values of $i$ and $j$, $\lim_{n \to \infty} (a_{ij})_n = a_{ij}$.

We indicate that a sequence of matrices $\{A_n\}$ converges to a
matrix $A$ by writing $A_n \to A$. A sequence of matrices which does
not converge is said to diverge or to be divergent.

According to Definition 1, the convergence of a sequence of 
(p,q) matrices depends on the convergence of pq scalar sequences. 
Hence, it is only an easy application of familiar ideas to prove the 
following results: 

LEMMA 1

If $\{A_n\}$ and $\{B_n\}$ are two sequences of $(p,q)$ matrices and if $A_n \to A$ and $B_n \to B$,
then, for all scalar constants $\alpha$ and $\beta$, $\alpha A_n + \beta B_n \to \alpha A + \beta B$.

LEMMA 2

If $\{A_n\}$ and $\{B_n\}$ are two sequences of suitably conformable matrices and if
$A_n \to A$ and $B_n \to B$, then $A_nB_n \to AB$.

LEMMA 3

If $\{A_n\}$ is a sequence of $(p,q)$ matrices and if $A_n \to A$, then, for suitably conformable
matrices $R$ and $S$, $RA_nS \to RAS$.

PROOF From the definition of matric multiplication, the element in the $i$th
row and $j$th column of the product $RA_nS$ is

$$\sum_{k=1}^{p}\sum_{l=1}^{q} r_{ik}(a_{kl})_n s_{lj}$$

Since $r_{ik}$ and $s_{lj}$ are clearly independent of $n$ and since, by hypothesis, $(a_{kl})_n \to a_{kl}$,
it follows that this finite sum converges to

$$\sum_{k=1}^{p}\sum_{l=1}^{q} r_{ik}a_{kl}s_{lj}$$

which is the element in the $i$th row and $j$th column of the product $RAS$, as asserted.

Using the fact that any square matrix is similar to an upper
triangular matrix, it is possible to prove the following important
theorem:*

THEOREM 1

If $A$ is a square matrix, a necessary and sufficient condition that $A^n \to 0$ is that
the absolute value of each characteristic root of $A$ be less than 1.
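The behavior described by this theorem can be observed numerically. In the sketch below the two upper triangular matrices are hypothetical examples (assumptions chosen for illustration): `A` has the repeated characteristic root 0.5, and `B` has the repeated characteristic root 1, so only the powers of `A` are expected to approach the null matrix. The helper names are likewise introduced here.

```python
def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def power(M, n):
    """M raised to the nonnegative integer power n by repeated products."""
    result = [[1.0 if i == j else 0.0 for j in range(len(M))]
              for i in range(len(M))]
    for _ in range(n):
        result = matmul(result, M)
    return result

def max_abs(M):
    """Largest absolute value among the elements of M."""
    return max(abs(x) for row in M for x in row)

# Hypothetical examples: triangular, so the characteristic roots
# can be read off the diagonal.
A = [[0.5, 1.0], [0.0, 0.5]]   # both roots 0.5, less than 1 in absolute value
B = [[1.0, 1.0], [0.0, 1.0]]   # both roots 1, not less than 1

for n in (10, 50, 200):
    print(n, max_abs(power(A, n)), max_abs(power(B, n)))
```

For `A` the off-diagonal element first grows before the factor $0.5^n$ takes over, which shows that convergence to zero need not be monotonic.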

We are now in a position to define the convergence of an 
infinite series of matrices : 

DEFINITION 2

The series of matrices $\sum_{m=0}^{\infty} c_mA_m$ is said to converge to the sum $S$ if and only if the
sequence of partial sums

$$\{S_n\} = \left\{ \sum_{m=0}^{n} c_mA_m \right\}$$

converges to $S$ as $n$ becomes infinite.

DEFINITION 3

A series of matrices $\sum_{m=0}^{\infty} c_mA_m$ is said to converge absolutely if and only if each
scalar series $\sum_{m=0}^{\infty} c_m(a_{ij})_m$ is absolutely convergent.

As a criterion for the absolute convergence of a series of matrices,
we have the result contained in the following theorem:


* See, for instance, Mirsky, “Linear Algebra,” p. 328.




THEOREM 2

If $a_m$ is the element of maximum absolute value in the matrix $A_m$, then a necessary
and sufficient condition that the series $\sum_{m=0}^{\infty} c_mA_m$ converge absolutely is that the
scalar series $\sum_{m=0}^{\infty} |c_m| \cdot |a_m|$ converge.

PROOF To prove the necessity of the condition of the theorem, let us assume
that the given series $\sum_{m=0}^{\infty} c_mA_m$ is absolutely convergent. Then, if $A_m$ is a $(p,q)$
matrix, it follows that, for all values of $i$ and $j$ such that $1 \le i \le p$ and $1 \le j \le q$,
the series $\sum_{m=0}^{\infty} c_m(a_{ij})_m$ is absolutely convergent; that is, $\sum_{m=0}^{\infty} |c_m| \cdot |(a_{ij})_m|$ converges.
Let $L$ ($\ge 0$) be the largest of the sums to which these $pq$ series converge. Then,
clearly, for all values of $i$ and $j$ and for every $n > 0$,

$$\sum_{m=0}^{n} |c_m| \cdot |(a_{ij})_m| \le L$$

If we now sum the last inequality for all values of $i$ and $j$, we obtain

$$\sum_{i,j}\sum_{m=0}^{n} |c_m| \cdot |(a_{ij})_m| \le \sum_{i,j} L = pqL$$

From this, by reversing the order of summation on the left and noting that $|c_m|$ is
independent of $i$ and $j$, we have

$$\sum_{m=0}^{n} |c_m| \sum_{i,j} |(a_{ij})_m| \le pqL \tag{1}$$

Now the absolute value of the element of largest absolute value in $A_m$ surely cannot
exceed the sum of the absolute values of all the elements in $A_m$; that is,

$$|a_m| \le \sum_{i,j} |(a_{ij})_m|$$

Hence, using this to underestimate the left member of (1), we have

$$\sum_{m=0}^{n} |c_m| \cdot |a_m| \le pqL$$

Thus the series $\sum_{m=0}^{\infty} |c_m| \cdot |a_m|$ converges, since its partial sums form a bounded
monotonically increasing sequence, and the necessity of the condition of the theorem
is established.

To prove the sufficiency of the condition of the theorem, let us suppose that
$\sum_{m=0}^{\infty} |c_m| \cdot |a_m|$ converges. Then, since $|(a_{ij})_m| \le |a_m|$ for all $i$ and $j$, it is clear that
$|c_m| \cdot |(a_{ij})_m| \le |c_m| \cdot |a_m|$ for all $i$ and $j$. Hence, by an easy application of the comparison
test for the convergence of scalar series, it follows that $\sum_{m=0}^{\infty} |c_m| \cdot |(a_{ij})_m|$
converges for all $i$ and $j$, which proves that the given series of matrices converges
absolutely, as asserted.


528 


FURTHER PROPERTIES OF MATRICES 


CHAP. 11 


If a series of matrices is absolutely convergent, it follows at 
once from the corresponding properties of scalar series that the 
matric terms can be rearranged at pleasure without in any way 
affecting the sum of the series. 

With the exception of Theorem 1, all the observations we 
have so far made about sequences and series of matrices apply to 
matrices of any shape. However in most applications we are con- 
cerned only with series of square matrices and, in particular, with 
power series of such matrices. The fundamental theorem on 
matric power series is the following:* 

THEOREM 3 

If the absolute value of each characteristic root of a square matrix $A$ is less than
the radius of convergence of the scalar power series $\phi(z) = \sum_{m=0}^{\infty} c_m z^m$, then the
matric power series $\phi(A) = \sum_{m=0}^{\infty} c_m A^m$ converges. If the absolute value of at least
one characteristic root of $A$ is greater than the radius of convergence of $\phi(z)$,
then $\phi(A)$ diverges. In particular cases, if the absolute value of the characteristic
root of largest absolute value is equal to the radius of convergence of $\phi(z)$, the
matric series $\phi(A)$ may either converge or diverge.

COROLLARY 1 

If $\phi(z)$ converges for all values of $z$, then $\phi(A)$ converges for all square matrices $A$.


COROLLARY 2 

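If $\phi(A) = \sum_{m=0}^{\infty} c_m A^m$ converges and if $A$ is similar to a diagonal matrix; that is, if

$$S^{-1}AS = D = \begin{Vmatrix} \lambda_1 & & & O \\ & \lambda_2 & & \\ & & \ddots & \\ O & & & \lambda_n \end{Vmatrix}$$

then

$$\phi(A) = S \begin{Vmatrix} \phi(\lambda_1) & & & O \\ & \phi(\lambda_2) & & \\ & & \ddots & \\ O & & & \phi(\lambda_n) \end{Vmatrix} S^{-1}$$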


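PROOF Clearly, the partial sums of the series $\phi(A)$ are all polynomials in $A$.
Hence, by Theorem 6, Sec. 11.4,

$$\phi_N(A) = \sum_{m=0}^{N} c_m A^m = S \begin{Vmatrix} \phi_N(\lambda_1) & & & O \\ & \phi_N(\lambda_2) & & \\ & & \ddots & \\ O & & & \phi_N(\lambda_n) \end{Vmatrix} S^{-1}$$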


* See, for instance, Mirsky, “Linear Algebra,” p. 332.





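Now, by hypothesis, $\phi(A)$ converges. Hence, each of the scalar series $\phi(\lambda_i)$ must
converge; that is, for each $i$, $\phi_N(\lambda_i) \to \phi(\lambda_i)$. Therefore, as $N \to \infty$,

$$\begin{Vmatrix} \phi_N(\lambda_1) & & & O \\ & \phi_N(\lambda_2) & & \\ & & \ddots & \\ O & & & \phi_N(\lambda_n) \end{Vmatrix} \to \begin{Vmatrix} \phi(\lambda_1) & & & O \\ & \phi(\lambda_2) & & \\ & & \ddots & \\ O & & & \phi(\lambda_n) \end{Vmatrix}$$

and we have, by Lemma 3,

$$\phi(A) = S \begin{Vmatrix} \phi(\lambda_1) & & & O \\ & \phi(\lambda_2) & & \\ & & \ddots & \\ O & & & \phi(\lambda_n) \end{Vmatrix} S^{-1}$$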


The last result provides us with a useful method of evaluating 
the sum of certain matric power series, which is sometimes 
preferable to the use of the Cayley-Hamilton theorem or Syl- 
vester’s identity. 


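EXAMPLE 1
What is $e^A$ if $A = \begin{Vmatrix} 0 & -2 \\ 1 & 3 \end{Vmatrix}$?

The given matrix $A$ is the one we considered in Example 3, Sec. 11.4. Hence, from that
example we know that the characteristic equation of $A$ is $\lambda^2 - 3\lambda + 2 = 0$ and that $\lambda_1 = 1$,
$\lambda_2 = 2$, and

$$S^{-1}AS = D = \begin{Vmatrix} 1 & 0 \\ 0 & 2 \end{Vmatrix} \quad \text{where} \quad S = \begin{Vmatrix} 2 & 1 \\ -1 & -1 \end{Vmatrix} \quad \text{and} \quad S^{-1} = \begin{Vmatrix} 1 & 1 \\ -1 & -2 \end{Vmatrix}$$

Therefore, by the last corollary,

$$e^A = S \begin{Vmatrix} \phi(\lambda_1) & 0 \\ 0 & \phi(\lambda_2) \end{Vmatrix} S^{-1} = \begin{Vmatrix} 2 & 1 \\ -1 & -1 \end{Vmatrix} \begin{Vmatrix} e & 0 \\ 0 & e^2 \end{Vmatrix} \begin{Vmatrix} 1 & 1 \\ -1 & -2 \end{Vmatrix} = \begin{Vmatrix} 2e - e^2 & 2e - 2e^2 \\ -e + e^2 & -e + 2e^2 \end{Vmatrix}$$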
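The evaluation by similarity can be checked numerically against partial sums of the exponential series. The sketch below is only an illustration: it assumes the matrix of the example is A = [[0, -2], [1, 3]] with S = [[2, 1], [-1, -1]] (values consistent with the result quoted above), and the helper names `matmul` and `exp_series` are hypothetical, not taken from the text.

```python
import math

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def exp_series(A, terms=40):
    """Partial sum I + A + A^2/2! + ... containing `terms` terms."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]                  # A^0 / 0!
    for m in range(1, terms):
        # Build A^m / m! from the previous term.
        term = [[x / m for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return total

A = [[0, -2], [1, 3]]          # assumed entries of the example matrix
S = [[2, 1], [-1, -1]]         # assumed diagonalizing matrix
S_inv = [[1, 1], [-1, -2]]
phi_D = [[math.exp(1), 0.0], [0.0, math.exp(2)]]   # phi(lambda_i) = e^lambda_i

exp_by_similarity = matmul(matmul(S, phi_D), S_inv)
exp_by_series = exp_series(A)
```

Under these assumptions the two results should agree to within rounding error, which is what Corollary 2 asserts for a convergent series.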


To evaluate $e^A$ by means of the Cayley-Hamilton theorem, we must simplify each of the
powers of $A$ in the expansion of $e^A$, using the relation $A^2 - 3A + 2I = O$, or

$$A^2 = 3A - 2I$$

At first glance this would seem to be a very tedious process, since arbitrarily high powers of A are 
involved. However, we may shorten the work appreciably by proceeding inductively: From the 
fact that the characteristic equation of .4 is quadratic, we know that any positive integral power 
of A can be expressed as a linear binomial in A. Hence, if we assume 


$$A^n = a_n A + b_n I \qquad n = 2, 3, 4, \ldots$$

we have

$$A^{n+1} = a_{n+1}A + b_{n+1}I = A \cdot A^n = a_n A^2 + b_n A = a_n(3A - 2I) + b_n A = (3a_n + b_n)A - 2a_n I$$

Therefore the $a$'s and $b$'s satisfy the recurrence relations*

$$a_{n+1} = 3a_n + b_n \qquad \text{and} \qquad b_{n+1} = -2a_n$$


These are examples of what are known as difference equations; see Sec. 4.5. 




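with, of course, the initial conditions $a_2 = 3$, $b_2 = -2$. From these it is easy to verify that

$$\begin{array}{ll} a_2 = 3 & b_2 = -2 \\ a_3 = 7 & b_3 = -6 \\ a_4 = 15 & b_4 = -14 \\ a_5 = 31 & b_5 = -30 \end{array}$$

and, in general, $a_n = 2^n - 1$, $b_n = -2^n + 2$. Hence

$$\begin{aligned}
e^A &= I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots \\
&= I + A + \frac{(2^2 - 1)A + (-2^2 + 2)I}{2!} + \frac{(2^3 - 1)A + (-2^3 + 2)I}{3!} + \cdots \\
&= A\left[\left(1 + \frac{2}{1!} + \frac{2^2}{2!} + \frac{2^3}{3!} + \cdots\right) - \left(1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots\right)\right] \\
&\quad + I\left[2\left(1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots\right) - \left(1 + \frac{2}{1!} + \frac{2^2}{2!} + \frac{2^3}{3!} + \cdots\right)\right] \\
&= A(e^2 - e) - I(e^2 - 2e) \\
&= (e^2 - e)\begin{Vmatrix} 0 & -2 \\ 1 & 3 \end{Vmatrix} - (e^2 - 2e)\begin{Vmatrix} 1 & 0 \\ 0 & 1 \end{Vmatrix} \\
&= \begin{Vmatrix} 2e - e^2 & 2e - 2e^2 \\ -e + e^2 & -e + 2e^2 \end{Vmatrix}
\end{aligned}$$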
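The inductive reduction of powers can be traced numerically. The following sketch iterates the recurrence relations for the coefficients in A^n = a_n A + b_n I, starting from a_2 = 3, b_2 = -2, and accumulates the coefficients of A and I in the series for e^A. The function name `power_coefficients` and the cutoff of 30 terms are illustrative assumptions.

```python
import math

def power_coefficients(n_max):
    """Return {n: (a_n, b_n)} with A^n = a_n*A + b_n*I, given A^2 = 3A - 2I."""
    coeffs = {2: (3, -2)}
    for n in range(2, n_max):
        a, b = coeffs[n]
        # a_{n+1} = 3 a_n + b_n,  b_{n+1} = -2 a_n
        coeffs[n + 1] = (3 * a + b, -2 * a)
    return coeffs

coeffs = power_coefficients(30)

# e^A = I + A + sum over n >= 2 of (a_n A + b_n I) / n!
coef_A = 1.0 + sum(a / math.factorial(n) for n, (a, b) in coeffs.items())
coef_I = 1.0 + sum(b / math.factorial(n) for n, (a, b) in coeffs.items())
```

The accumulated coefficients should approach e^2 - e and -(e^2 - 2e), in agreement with the closed form obtained in the example.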


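as before. Finally, using Sylvester's formula, we have

$$\begin{aligned}
\phi(A) = e^A &= \frac{\phi(\lambda_1)}{\lambda_1 - \lambda_2}(A - \lambda_2 I) + \frac{\phi(\lambda_2)}{\lambda_2 - \lambda_1}(A - \lambda_1 I) \\
&= \frac{e}{1 - 2}(A - 2I) + \frac{e^2}{2 - 1}(A - I) \\
&= A(e^2 - e) - I(e^2 - 2e)
\end{aligned}$$

which is the first expression we obtained using the Cayley-Hamilton theorem.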
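Sylvester's formula for a (2,2) matrix with distinct characteristic roots is short enough to code directly. The sketch below is a minimal illustration assuming the example matrix is A = [[0, -2], [1, 3]] with roots 1 and 2; the function name `sylvester_2x2` is hypothetical.

```python
import math

def sylvester_2x2(A, lam1, lam2, phi):
    """phi(A) for a 2x2 matrix A with distinct characteristic roots lam1, lam2."""
    identity = [[1.0, 0.0], [0.0, 1.0]]
    w1 = phi(lam1) / (lam1 - lam2)      # weight of (A - lam2 I)
    w2 = phi(lam2) / (lam2 - lam1)      # weight of (A - lam1 I)
    return [[w1 * (A[i][j] - lam2 * identity[i][j])
             + w2 * (A[i][j] - lam1 * identity[i][j])
             for j in range(2)] for i in range(2)]

A = [[0, -2], [1, 3]]                   # assumed entries of the example matrix
exp_A = sylvester_2x2(A, 1.0, 2.0, math.exp)
```

The same function evaluates other functions of A (for instance `math.cos` or `math.sin`) simply by changing the `phi` argument.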

The question of whether or not scalar identities such as
$e^x e^y = e^{x+y}$, $\sin 2x = 2 \sin x \cos x$, and

$$\cos (x + y) = \cos x \cos y - \sin x \sin y$$

remain valid when x and y are replaced by square matrices is 
obviously an important one. We cannot investigate the matter 
here, but the applications one is likely to encounter are covered 
by the following theorem:* 


THEOREM 4 

For the elementary transcendental functions, scalar identities in a single variable 
remain valid when the scalar variable is replaced by a square matrix. Scalar 
identities in two variables remain valid when the scalar variables are replaced by 
square matrices only when the two matrices commute. 
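The role of commutativity in Theorem 4 can be seen in a small numerical experiment. The sketch below compares e^A e^B with e^(A+B) for one pair of matrices that do not commute and one pair that does; the particular matrices, the helper names, and the number of series terms are illustrative assumptions.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(X, Y):
    """Element-by-element sum of two matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def exp_series(A, terms=80):
    """Partial sum of the exponential series I + A + A^2/2! + ..."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in matmul(term, A)]
        total = matadd(total, term)
    return total

def max_gap(X, Y):
    """Largest absolute difference between corresponding elements."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# A and B do not commute: AB != BA.
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]
gap_noncommuting = max_gap(matmul(exp_series(A), exp_series(B)),
                           exp_series(matadd(A, B)))

# C and 2C commute, since any matrix commutes with its own multiples.
C = [[0.0, -2.0], [1.0, 3.0]]
C2 = [[2.0 * x for x in row] for row in C]
gap_commuting = max_gap(matmul(exp_series(C), exp_series(C2)),
                        exp_series(matadd(C, C2)))
```

For the first pair the gap is far from zero, while for the commuting pair it is at the level of rounding error.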


See, for example, Mirsky, “Linear Algebra,” pp. 338-341. 




EXERCISES 


1 Prove Lemma 1. 

2 Prove Lemma 2. 

3 For the matrix $A$ of Example 1, compute $e^{-A}$ and verify that $e^A e^{-A} = I$.

4 Using Sylvester's identity, compute $e^A$, $\cos A$, and $\sin A$ for each of the following matrices $A$:

-111 


4 —3 II 
-1 1 
2 -2 
-2 2 


Using both the Cayley-Hamilton theorem and Sylvester's identity, evaluate $e^A$ for

A- II 1 “Ml. 

II “I 0 || 

For each of the following matrices $A$, verify that $\sin 2A = 2 \sin A \cos A$:


If .4 = 
holds: 


i\\ 


1 

-2 || 
2 

-1 

2 


, verify that none of the following relations holds : 


$\sin (A \pm B) = \sin A \cos B \pm \cos A \sin B$
$\cos (A \pm B) = \cos A \cos B \mp \sin A \sin B$


% 

M | and B - 

H 


-H 

H | 

-H 

-H ’ 


verify that each of the following relations 


$\sin (A \pm B) = \sin A \cos B \pm \cos A \sin B$
$\cos (A \pm B) = \cos A \cos B \mp \sin A \sin B$
Determine for what values of x and y, if any, the matrices 

A = 


X X — 1 

and 

b - y *-* 

— X 1 — X 


-1 - y —y 


commute, and verify that, under these conditions, 

$\sin (A \pm B) = \sin A \cos B \pm \cos A \sin B$
$\cos (A \pm B) = \cos A \cos B \mp \sin A \sin B$


0 


— 1 — 2 , show that sin A 

|| 2 -2 o||. 

s A, and verify that cos 2 A + sin 2 A = I. 


, sin 3 


Obtain a similar expression for 


Vector Analysis 


12.1 

The algebra of vectors 

In Sec. 10.2, in our discussion of determinants and matrices, we 
introduced the concept of a vector as an ordered set of n quantities, 
say $(a_1, a_2, \ldots, a_n)$. In the present chapter we shall undertake
the study of what is known as vector analysis, using the more 
traditional (and limited) definition of a vector as a quantity, such 
as force, velocity, or acceleration, which possesses both magnitude
and direction. Although this approach has been all but abandoned 
in pure mathematics because it is unnecessarily restricted, it is still 
the usual approach in physics and in engineering, and the results 
to which it leads are of great utility in these fields. 

Almost any physical discussion will involve, in addition to 
vector quantities, other quantities, such as volume, mass, and 
work, which possess only magnitude and are known as scalars. To 
distinguish vectors from scalars we shall consistently write the
former in boldface type; thus, $\mathbf{V}$. This is a rather common notation,
although some authors indicate that a symbol stands for a
vector quantity by putting an arrow above the symbol; thus, $\vec{V}$.

A scalar quantity can be adequately represented by a mark 
on a fixed scale. To represent a vector quantity, however, we must 
use a directed line segment whose direction is the same as the 
direction of the vector and whose length is equal (on some con- 
venient scale) to the magnitude of the vector. For convenience, 
we shall often refer to the representative line segment as though it 
were the vector itself. The magnitude or length of a vector A is 
called the absolute value of the vector and is indicated either by 
enclosing the symbol for the vector between ordinary absolute- 
value bars or simply by setting the symbol for the vector in 
ordinary rather than boldface type. Thus, 

$$A = |\mathbf{A}|$$


represents the magnitude, or absolute value, of the vector A. 
Regardless of its direction, a vector whose length, or absolute 
value, is unity is called a unit vector. A vector is said to be zero 
if and only if its absolute value is zero. The direction of a zero 
vector is undefined. 

Two vectors whose magnitudes, or lengths, are equal and 
whose directions are the same are said to be equal, regardless of 
the points in space from which they may be drawn. * If two vec- 
tors have the same length but are oppositely directed, either is 
said to be the negative of the other. 

The sum of two vectors A and B is defined by the familiar 
parallelogram law; i.e., if A and B are drawn from the same point, 
or origin, and if the parallelogram having A and B as adjacent 
sides is constructed, then the sum A + B is the vector represented 
by the diagonal of this parallelogram which passes through the 
common origin of A and B (Fig. 12.1a). From this definition it is 
evident that 




A + B = B + A 

i.e., that vector addition is commutative, and that
A + (B + C) = (A + B) + C 

i.e., that vector addition is associative. By the difference of two 
vectors A and B, we mean the sum of the first and the negative of 
the second; i.e., 

A - B = A + (— B) 

(Fig. 12.1 b). By the product of a scalar a and a vector A we mean 
the vector aA whose length is equal to the product of |a| and the


* In other words, a vector quantity can be represented equally well by any 
of infinitely many equivalent line segments, all having the same length and 
the same direction. It is, therefore, customary to say that a vector can be 
moved parallel to itself without change. In some applications, however, as 
for instance in dealing with forces whose points of application or lines of 
action cannot be shifted, it is necessary to think of a vector as fixed, or at 
least limited in position. Such vectors are usually said to be bound, in
contrast to unrestricted vectors, which are said to be free.


magnitude of A and whose direction is the same as the direction
of A if a is positive and opposite to it if a is negative.

In addition to the product of a scalar and a vector, two other
types of product are defined in vector analysis. The first of these
is the scalar or dot or inner product, indicated by placing a dot
between the two factors. By definition, this is a scalar equal to
the product of the absolute values of the two vector factors and
the cosine of the angle between their positive directions; i.e.,

(1)  $\mathbf{A} \cdot \mathbf{B} = |\mathbf{A}|\,|\mathbf{B}| \cos\theta = AB \cos\theta$

Since |A| cos θ is just the projection of the vector A in the direction
of B and since |B| cos θ is the projection of the vector B in the
direction of A, it follows that the dot product of two vectors is equal
to the length of either of them, multiplied by the projection of the other
upon it (Fig. 12.2).

FIGURE 12.2 The geometrical interpretation of the scalar product.

Two particular cases of this are worthy of
note: If one of the vectors, say A, is of unit length, then A · B
becomes simply

$$|\mathbf{B}| \cos\theta = B \cos\theta$$

which is just the projection, or component, of B in the direction of
the unit vector A. On the other hand, if A = B, then, obviously,
cos θ = cos 0 = 1, and we have

(2)  $\mathbf{A} \cdot \mathbf{A} = |\mathbf{A}|^2 = A^2$

From the relation between dot products and projections it is
easy to show that dot multiplication is distributive over addition; i.e.,

(3)  $\mathbf{A} \cdot (\mathbf{B} + \mathbf{C}) = \mathbf{A} \cdot \mathbf{B} + \mathbf{A} \cdot \mathbf{C}$

Moreover, from the definitive relation (1) it is clear that dot
multiplication is commutative; i.e.,

(4)  $\mathbf{A} \cdot \mathbf{B} = \mathbf{B} \cdot \mathbf{A}$

However, if the dot product of two vectors is zero, it does not
follow that one or the other of the factors is zero, for there is a
third possibility, namely, cos θ = 0. Thus if A · B = 0, then either
at least one of the vectors (A,B) is zero or A and B are perpendicular.

The third type of product with which we shall deal is the 
vector, or cross product, indicated by placing a cross between the 


factors.* If A and B are the factors, then by definition A × B is a
vector V whose absolute value is the product of the absolute values
of A, B, and the sine of the angle between them, and whose direction
is perpendicular to the plane determined by A and B and so
sensed that a right-handed screw turned from A toward B through
the smaller of the angles between these vectors would advance in
the direction of V (Fig. 12.3a). Since |B| |sin θ| is the projection
of B in a direction perpendicular to A, or, in other words, is the
altitude of the parallelogram determined when A and B are drawn
from a common point, it follows that the magnitude of A × B,
namely, |A| (|B| |sin θ|), is equal to the area of this parallelogram
(Fig. 12.3b).

FIGURE 12.3 The geometrical interpretation of the vector product.

From the relation between cross products and areas it is easy
to show that cross multiplication is distributive over addition; i.e.,

(5)  $\mathbf{A} \times (\mathbf{B} + \mathbf{C}) = \mathbf{A} \times \mathbf{B} + \mathbf{A} \times \mathbf{C}$

However, since the direction of A × B is determined by the right-hand
rule, it is clear that interchanging A and B reverses the direction,
or sign, of their product. Hence, cross multiplication is not
commutative, and we have, in fact,

(6)  $\mathbf{A} \times \mathbf{B} = -\mathbf{B} \times \mathbf{A}$

Multiplication in which products obey this rule is sometimes said
to be anticommutative.

From the foregoing it is clear that we must be careful to
preserve the proper order of factors in any expression involving
vector multiplication. Moreover, if A × B = 0, we cannot conclude
that either A or B is zero, for this product will also vanish
if sin θ = 0. Hence if A × B = 0, then either at least one of the
vectors (A,B) is zero or A and B are parallel.

It is often convenient to be able to refer vector expressions to 


* Meaning has also been given to the symbol AB, and in fact under the name 
dyad such combinations have been extensively studied, as, for instance, in 
Gibbs-Wilson, “Vector Analysis,” Yale University Press, New Haven, Conn., 
1929. We shall not consider them in our work, however, since they are 
actually special cases of what are known as tensors, which we shall consider 
from a somewhat different point of view in the next chapter. For us, the only 
possible product-type combinations of two vectors will be the dot and cross 
products themselves. 


a cartesian frame of reference. To provide for this we define i, j,
and k to be vectors of unit length directed, respectively, along the
positive x-, y-, and z-axes of a right-handed coordinate system.
Then xi, yj, and zk represent vectors of lengths x, y, and z whose
directions are those of the respective axes, and from the definition
of vector addition it is evident that the vector joining the origin
to a general point P:(x,y,z) (Fig. 12.4) can be written

(7)  $\mathbf{R} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$

FIGURE 12.4 The representation of a vector as a linear combination of the unit vectors i, j, k.

In more general terms, any vector whose components along the
axes are, respectively, $a_1$, $a_2$, and $a_3$ can be written

$$\mathbf{A} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$$

If, further,

$$\mathbf{B} = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$$

then

$$\mathbf{A} \pm \mathbf{B} = (a_1 \pm b_1)\mathbf{i} + (a_2 \pm b_2)\mathbf{j} + (a_3 \pm b_3)\mathbf{k}$$

Clearly, two vectors will be equal if and only if their respective components
are equal. Hence, any vector equation implies three scalar
equations.

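Since the dot product of perpendicular vectors is zero, it
follows that

(8)  $\mathbf{i} \cdot \mathbf{j} = \mathbf{j} \cdot \mathbf{k} = \mathbf{k} \cdot \mathbf{i} = 0$

Moreover, applying (2) to the unit vectors i, j, k, we have

(9)  $\mathbf{i} \cdot \mathbf{i} = \mathbf{j} \cdot \mathbf{j} = \mathbf{k} \cdot \mathbf{k} = 1$

Hence, if we write

$$\mathbf{A} \cdot \mathbf{B} = (a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}) \cdot (b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k})$$

and use the fact that dot multiplication is distributive over
addition [Eq. (3)] to expand and simplify, we obtain the important
result

(10)  $\mathbf{A} \cdot \mathbf{B} = a_1 b_1 + a_2 b_2 + a_3 b_3$

In particular, taking B = A, we have

(11)  $\mathbf{A} \cdot \mathbf{A} = |\mathbf{A}|^2 = a_1^2 + a_2^2 + a_3^2$  or  $|\mathbf{A}| = \sqrt{a_1^2 + a_2^2 + a_3^2}$ †

On the other hand, if we write A · B = |A| |B| cos θ and then solve
for cos θ, using (10) and (11), we obtain the useful formula

(12)  $\cos\theta = \dfrac{a_1 b_1 + a_2 b_2 + a_3 b_3}{\sqrt{a_1^2 + a_2^2 + a_3^2}\,\sqrt{b_1^2 + b_2^2 + b_3^2}}$

a result familiar from analytic geometry, where the a's and b's were
introduced not as the components of two vectors, but as the
direction numbers of two straight lines.

For the cross products of the unit vectors i, j, k we find at
once

(13)  $\mathbf{i} \times \mathbf{i} = \mathbf{j} \times \mathbf{j} = \mathbf{k} \times \mathbf{k} = \mathbf{0}$
      $\mathbf{i} \times \mathbf{j} = -\mathbf{j} \times \mathbf{i} = \mathbf{k}$
      $\mathbf{j} \times \mathbf{k} = -\mathbf{k} \times \mathbf{j} = \mathbf{i}$
      $\mathbf{k} \times \mathbf{i} = -\mathbf{i} \times \mathbf{k} = \mathbf{j}$

Hence, using (13) and the fact that cross multiplication is distributive
over addition, we obtain for

$$\mathbf{A} \times \mathbf{B} = (a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}) \times (b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k})$$

the expression

(14)  $\mathbf{A} \times \mathbf{B} = (a_2 b_3 - a_3 b_2)\mathbf{i} - (a_1 b_3 - a_3 b_1)\mathbf{j} + (a_1 b_2 - a_2 b_1)\mathbf{k}$

which is precisely the expanded form of the determinant

(15)  $\mathbf{A} \times \mathbf{B} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}$

The anticommutative character of vector multiplication thus corresponds
to the fact that interchanging two rows of a determinant
changes the sign of the determinant.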
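The component formulas for the dot product, the length, the angle, and the cross product translate directly into code. The sketch below is an illustration only; the function names and the two sample vectors are assumptions chosen so that the arithmetic is easy to follow by hand.

```python
import math

def dot(a, b):
    """Dot product from components: a1*b1 + a2*b2 + a3*b3."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def norm(a):
    """Length of a vector: sqrt(a . a)."""
    return math.sqrt(dot(a, a))

def cross(a, b):
    """Cross product from the expanded form of the i, j, k determinant."""
    return (a[1] * b[2] - a[2] * b[1],
            -(a[0] * b[2] - a[2] * b[0]),
            a[0] * b[1] - a[1] * b[0])

A = (2.0, -2.0, 1.0)     # sample vector, length 3
B = (1.0, 8.0, -4.0)     # sample vector, length 9

cos_theta = dot(A, B) / (norm(A) * norm(B))   # cosine of the angle between A and B
AxB = cross(A, B)
BxA = cross(B, A)
```

For these sample vectors A × B is perpendicular to both factors, and reversing the order of the factors reverses the sign of every component.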

EXAMPLE 1 

Using vector methods, derive the law of cosines. 

To do this, let directions be assigned to the sides of the given triangle as in Fig. 12.5. 
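Then C = A - B; hence,

$$\mathbf{C} \cdot \mathbf{C} = (\mathbf{A} - \mathbf{B}) \cdot (\mathbf{A} - \mathbf{B}) = \mathbf{A} \cdot \mathbf{A} - 2\mathbf{A} \cdot \mathbf{B} + \mathbf{B} \cdot \mathbf{B}$$

or, using (1) and (2),

$$C^2 = A^2 + B^2 - 2AB \cos\theta$$

which is the law of cosines.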


† Formula (10) is precisely the scalar product of the two vectors A =
$\|a_1, a_2, a_3\|$ and B = $\|b_1, b_2, b_3\|$ as we defined it from the more general point of
view of Chap. 10 (Sec. 10.2). Similarly, Formula (11) gives the length of the
vector A = $\|a_1, a_2, a_3\|$ as we defined it in Chap. 10, Sec. 10.2.


FIGURE 12.5 The triangle used in the vector derivation of the law of cosines.



EXAMPLE 2 

If ( x,y,z ) and (x',y',z') are two right-handed coordinate systems having a common origin, obtain 
by vector methods the transformation equations connecting the two systems of coordinates. 

To do this, let i, j, k and i', j', k' be unit vectors in the direction of the respective axes 
(Fig. 12.6), and let P be a general point in space having coordinates (x,y,z) and (x',y',z') in the 
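respective systems. Now, the coordinates (x',y',z') are simply the components of the vector OP
along the x'-, y'-, z'-axes. Hence, if we write

$$\mathbf{R} = \mathbf{OP} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$$

and observe that the dot products of this vector with the unit vectors i', j', and k' are its components
in these directions, we find the required formulas to be

$$\begin{aligned}
x' &= \mathbf{R} \cdot \mathbf{i}' = (x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) \cdot \mathbf{i}' = x(\mathbf{i} \cdot \mathbf{i}') + y(\mathbf{j} \cdot \mathbf{i}') + z(\mathbf{k} \cdot \mathbf{i}') \\
y' &= \mathbf{R} \cdot \mathbf{j}' = (x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) \cdot \mathbf{j}' = x(\mathbf{i} \cdot \mathbf{j}') + y(\mathbf{j} \cdot \mathbf{j}') + z(\mathbf{k} \cdot \mathbf{j}') \\
z' &= \mathbf{R} \cdot \mathbf{k}' = (x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) \cdot \mathbf{k}' = x(\mathbf{i} \cdot \mathbf{k}') + y(\mathbf{j} \cdot \mathbf{k}') + z(\mathbf{k} \cdot \mathbf{k}')
\end{aligned}$$

From (1), the products (i · i'), (j · i'), . . . , (k · k') are just the cosines of the angles between
the various axes of the two systems and are known from the data of the problem.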
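The transformation equations amount to three dot products. As a hedged illustration, the sketch below assumes a primed system obtained by rotating the original axes 30 degrees about the z-axis; the rotation, the sample point, and the function name `to_primed` are assumptions made for this example.

```python
import math

def dot(a, b):
    """Dot product of two vectors given by their (x, y, z) components."""
    return sum(p * q for p, q in zip(a, b))

def to_primed(point, i_p, j_p, k_p):
    """Coordinates (x', y', z') of `point`, given the primed unit vectors
    i', j', k' expressed by their components in the unprimed system."""
    return (dot(point, i_p), dot(point, j_p), dot(point, k_p))

angle = math.radians(30.0)
# Components of i', j', k' are the direction cosines with the x-, y-, z-axes.
i_p = (math.cos(angle), math.sin(angle), 0.0)
j_p = (-math.sin(angle), math.cos(angle), 0.0)
k_p = (0.0, 0.0, 1.0)

P = (1.0, 2.0, 3.0)
P_primed = to_primed(P, i_p, j_p, k_p)
```

Because both systems are orthogonal and share an origin, the distance of the point from the origin is the same in either set of coordinates.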


When we consider products involving three rather than two 
vectors, we encounter the following possibilities: 

(A • B)C A • (B X C) A x (B X C) 


The first can be dismissed with a word. In fact A • B is just a scalar, 
and thus (A • B)C is simply a vector whose length is |A • B| times 
the length of C and whose direction is the same as that of C or 
opposite to it, according as A ■ B is positive or negative. 

For the product A * (B x C), which is known as a scalar 
triple product, we observe first that the parentheses enclosing the 
vector product B x C are superfluous. There is, in fact, only one 
alternative interpretation, namely, (A • B) x C, and this is 
meaningless, since both factors in a cross product must be vectors 



whereas A · B is a scalar. Thus, no meaning but the intended one
can be attached to the expression A · B × C; hence, it is customary
to omit the parentheses.

FIGURE 12.7 The geometrical interpretation of the scalar triple product.

Geometrically, the scalar triple product A • B x C represents 
the volume of the parallelepiped having the vectors A, B, and C as
concurrent edges. For, if we regard the parallelogram having B and 
C as adjacent sides as the base of this figure, then B x C is a vector 
whose direction is perpendicular to the base and whose magnitude 
is equal to the area of the base. Moreover, the altitude of the 
parallelepiped is the projection of A on B X C (Fig. 12.7a). Hence, 
A • B x C, whose value is just the magnitude of B x C multiplied 
by the projection of A on B x C, is numerically equal to the 
volume of the parallelepiped. If θ is less than π/2, i.e., if A and
B × C lie on the same side of the plane of B and C, then cos θ is
positive and so is the scalar triple product. In particular, changing 
the order of the factors B and C gives the product C x B, whose 
direction, of course, is opposite to that of B xC; hence 

(16) A • B x C = —A • C x B 

Since the volume of a parallelepiped is independent of the face 
chosen as its base, it follows, by applying the preceding argument 
to Fig. 12.7b and c, that B · C × A and C · A × B also give the
volume of the same parallelepiped. From this fact, together with 
(16), we therefore find 

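(17)  $\mathbf{A} \cdot \mathbf{B} \times \mathbf{C} = \mathbf{B} \cdot \mathbf{C} \times \mathbf{A} = \mathbf{C} \cdot \mathbf{A} \times \mathbf{B} = -\mathbf{A} \cdot \mathbf{C} \times \mathbf{B} = -\mathbf{B} \cdot \mathbf{A} \times \mathbf{C} = -\mathbf{C} \cdot \mathbf{B} \times \mathbf{A}$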

The first three arrangements can be obtained by starting anywhere 
on the circle in Fig. 12.8 and reading the letters in the counter- 
clockwise direction. For this reason they are said to be cyclic 
permutations of one another. Similarly, the last three arrangements
are cyclic permutations of one another which can be
obtained by reading the letters in Fig. 12.8 in the clockwise direc- 
tion. Thus, (17) asserts that any cyclic permutation of the factors 
in a scalar triple product leaves the value of the product unchanged, 
whereas a permutation which reverses the original cyclic order changes 
the sign of the product. 



FIGURE 12.8 An illustration of cyclic and anticyclic permutations.

Furthermore, since the order of factors in a dot product is 
immaterial, we find, by considering the first and third members 
of (17), that A · B × C = C · A × B = A × B · C, which shows
that in any scalar triple product the dot and cross can be inter- 
changed without altering the value of the product. For this reason 
it is customary to omit these symbols and write a scalar triple 
product simply as [ABC]. 

If the vectors A, B, C all lie in the same plane or are parallel 
to the same plane, they necessarily form a parallelepiped of zero 
volume, and conversely. Hence, [ABC] = 0 is a necessary and
sufficient condition that three vectors A, B, and C be parallel to 
one and the same plane. In particular, if two factors of a scalar 
triple product have the same direction, the product is zero. 

Analytically, if we write 

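$$\mathbf{A} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k} \qquad \mathbf{B} = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k} \qquad \mathbf{C} = c_1\mathbf{i} + c_2\mathbf{j} + c_3\mathbf{k}$$

we have

$$\begin{aligned}
\mathbf{A} \cdot \mathbf{B} \times \mathbf{C} &= (a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}) \cdot [(b_2 c_3 - b_3 c_2)\mathbf{i} - (b_1 c_3 - b_3 c_1)\mathbf{j} + (b_1 c_2 - b_2 c_1)\mathbf{k}] \\
&= a_1(b_2 c_3 - b_3 c_2) - a_2(b_1 c_3 - b_3 c_1) + a_3(b_1 c_2 - b_2 c_1)
\end{aligned}$$

which is just the expanded form of the determinant

(18)  $[\mathbf{ABC}] = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$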


The relations in (17) are thus equivalent to the familiar fact that 
interchanging any two rows in a determinant changes the sign of 
the determinant. 
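The determinant form of the scalar triple product and the permutation rules of (17) can be confirmed with integer arithmetic. The sketch below is illustrative; the sample vectors and the function names `triple` and `det3` are assumptions.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def triple(a, b, c):
    """Scalar triple product [abc] = a . (b x c)."""
    return dot(a, cross(b, c))

def det3(r1, r2, r3):
    """Third-order determinant with rows r1, r2, r3, expanded along the first row."""
    return (r1[0] * (r2[1] * r3[2] - r2[2] * r3[1])
            - r1[1] * (r2[0] * r3[2] - r2[2] * r3[0])
            + r1[2] * (r2[0] * r3[1] - r2[1] * r3[0]))

A = (2, -2, 1)
B = (1, 8, -4)
C = (12, -4, -3)

abc = triple(A, B, C)
cyclic = (triple(B, C, A), triple(C, A, B))                       # same cyclic order
anticyclic = (triple(A, C, B), triple(B, A, C), triple(C, B, A))  # reversed order
```

The absolute value of `abc` is the volume of the parallelepiped having A, B, and C as concurrent edges.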


EXAMPLE 3 

If A, B, and C are three vectors which are not parallel to the same plane, show that any 
vector V can be expressed as a linear combination of A, B, and C. 

If we write 

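(19)  $\mathbf{V} = a\mathbf{A} + b\mathbf{B} + c\mathbf{C}$

where a, b, and c are scalar constants to be determined, and form the cross product of each
member with the vector B, we obtain

$$\mathbf{V} \times \mathbf{B} = a\mathbf{A} \times \mathbf{B} + b\mathbf{B} \times \mathbf{B} + c\mathbf{C} \times \mathbf{B} = a\mathbf{A} \times \mathbf{B} + c\mathbf{C} \times \mathbf{B}$$

where the term B × B vanishes because its factors are identical. Now, if we form the dot
product of the last result and the vector C, we have

$$\mathbf{V} \times \mathbf{B} \cdot \mathbf{C} = a\mathbf{A} \times \mathbf{B} \cdot \mathbf{C} + c\mathbf{C} \times \mathbf{B} \cdot \mathbf{C} = a\mathbf{A} \times \mathbf{B} \cdot \mathbf{C}$$

where the term C × B · C vanishes because it is a scalar triple product with two factors
identical. By hypothesis, A, B, and C are not parallel to the same plane. Hence, A × B · C is
different from zero, and we can solve for a, getting

$$a = \frac{[\mathbf{VBC}]}{[\mathbf{ABC}]}$$

In the same way we can obtain the remaining constants in the required linear combination:

$$b = \frac{[\mathbf{AVC}]}{[\mathbf{ABC}]} \qquad c = \frac{[\mathbf{ABV}]}{[\mathbf{ABC}]}$$

Thus, under the conditions of the problem,

(20)  $\mathbf{V} = \dfrac{[\mathbf{VBC}]}{[\mathbf{ABC}]}\mathbf{A} + \dfrac{[\mathbf{AVC}]}{[\mathbf{ABC}]}\mathbf{B} + \dfrac{[\mathbf{ABV}]}{[\mathbf{ABC}]}\mathbf{C}$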


The following special case of this result is often useful: If V is any vector parallel to the 
plane determined by A and B, then [ABV] = 0 and the last term in the expansion (20) is zero. 
Hence it follows that, if A and B are vectors which are not parallel to the same line and if V is 
any vector parallel to the plane determined by A and B, then V can be expressed as a linear 
combination of A and B. 
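The expansion derived in this example can be carried out exactly with rational arithmetic. In the sketch below the three base vectors, the vector V, and the function name `expand` are assumptions chosen for illustration; the coefficients are the ratios of scalar triple products obtained above.

```python
from fractions import Fraction

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def triple(a, b, c):
    """Scalar triple product [abc]."""
    return dot(a, cross(b, c))

def expand(V, A, B, C):
    """Coefficients (a, b, c) with V = a*A + b*B + c*C; requires [ABC] != 0."""
    d = triple(A, B, C)
    if d == 0:
        raise ValueError("A, B, C are parallel to the same plane")
    return (Fraction(triple(V, B, C), d),
            Fraction(triple(A, V, C), d),
            Fraction(triple(A, B, V), d))

A = (2, -2, 1)
B = (1, 8, -4)
C = (12, -4, -3)
V = (1, 2, 3)

a, b, c = expand(V, A, B, C)
rebuilt = tuple(a * A[k] + b * B[k] + c * C[k] for k in range(3))
```

Recombining the three terms reproduces V exactly, which serves as a check on the coefficients.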


To express the vector triple product A X (B x C) in a simpler 
expanded form, let us consider first the general case in which 
neither A, B, nor C is a zero vector and B and C are not parallel. 
Now, from the definition of a cross product it is clear that 
A x (B x C) is a vector perpendicular to A and to B x C. But 
B x C is itself perpendicular to the plane of B and C, and, thus, 
any vector such as A x (B x C) which is perpendicular to B x C 
must lie in the plane of B and C (Fig. 12.9). Hence, by Example 
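3, the vector A × (B × C) must be expressible as a linear combination
of B and C; that is,

$$\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \lambda\mathbf{B} + \mu\mathbf{C}$$

To find λ and μ, we first use the fact that A × (B × C) is also
perpendicular to A and, hence, that its dot product with A must
be zero:

$$\mathbf{A} \cdot [\mathbf{A} \times (\mathbf{B} \times \mathbf{C})] = \mathbf{A} \cdot (\lambda\mathbf{B} + \mu\mathbf{C}) = \lambda(\mathbf{A} \cdot \mathbf{B}) + \mu(\mathbf{A} \cdot \mathbf{C}) = 0$$

Thus,

$$\frac{\lambda}{\mu} = -\frac{\mathbf{A} \cdot \mathbf{C}}{\mathbf{A} \cdot \mathbf{B}}$$

and, therefore,

$$\lambda = \nu(\mathbf{A} \cdot \mathbf{C}) \qquad \mu = -\nu(\mathbf{A} \cdot \mathbf{B})$$

and

(21)  $\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \nu[(\mathbf{A} \cdot \mathbf{C})\mathbf{B} - (\mathbf{A} \cdot \mathbf{B})\mathbf{C}]$

To find ν, it is convenient to consider first the special case in
which A = B:

(22)  $\mathbf{B} \times (\mathbf{B} \times \mathbf{C}) = \nu_1[(\mathbf{B} \cdot \mathbf{C})\mathbf{B} - (\mathbf{B} \cdot \mathbf{B})\mathbf{C}]$

Let θ be the angle between B and C, and form the dot product of
C with each side of the last equality. Then, using the properties of
the scalar triple product and applying Formulas (1) and (2) to the
resulting dot products, we have

$$\begin{aligned}
\mathbf{B} \times (\mathbf{B} \times \mathbf{C}) \cdot \mathbf{C} = -(\mathbf{B} \times \mathbf{C}) \cdot (\mathbf{B} \times \mathbf{C}) &= \nu_1[(\mathbf{B} \cdot \mathbf{C})(\mathbf{B} \cdot \mathbf{C}) - (\mathbf{B} \cdot \mathbf{B})(\mathbf{C} \cdot \mathbf{C})] \\
-|\mathbf{B} \times \mathbf{C}|^2 &= \nu_1(B^2 C^2 \cos^2\theta - B^2 C^2) \\
-B^2 C^2 \sin^2\theta &= \nu_1 B^2 C^2(\cos^2\theta - 1) = -\nu_1 B^2 C^2 \sin^2\theta
\end{aligned}$$

Hence, $\nu_1 = 1$, and (22) becomes specifically

(23)  $\mathbf{B} \times (\mathbf{B} \times \mathbf{C}) = (\mathbf{B} \cdot \mathbf{C})\mathbf{B} - (\mathbf{B} \cdot \mathbf{B})\mathbf{C}$

We now return to Eq. (21) and form the dot product of both
members with B:

$$\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) \cdot \mathbf{B} = \nu[(\mathbf{A} \cdot \mathbf{C})(\mathbf{B} \cdot \mathbf{B}) - (\mathbf{A} \cdot \mathbf{B})(\mathbf{C} \cdot \mathbf{B})]$$

Now, by an obvious rearrangement of the scalar triple product on
the left, we have

$$-\mathbf{A} \cdot \mathbf{B} \times (\mathbf{B} \times \mathbf{C}) = \nu[(\mathbf{A} \cdot \mathbf{C})(\mathbf{B} \cdot \mathbf{B}) - (\mathbf{A} \cdot \mathbf{B})(\mathbf{C} \cdot \mathbf{B})]$$

or, applying Eq. (23) to the left-hand side,

$$-\mathbf{A} \cdot [(\mathbf{B} \cdot \mathbf{C})\mathbf{B} - (\mathbf{B} \cdot \mathbf{B})\mathbf{C}] = \nu[(\mathbf{A} \cdot \mathbf{C})(\mathbf{B} \cdot \mathbf{B}) - (\mathbf{A} \cdot \mathbf{B})(\mathbf{C} \cdot \mathbf{B})]$$

which will be true if and only if ν = 1.† Hence, in general,

(24)  $\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = (\mathbf{A} \cdot \mathbf{C})\mathbf{B} - (\mathbf{A} \cdot \mathbf{B})\mathbf{C}$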

Moreover, if either A, B, or C is zero or if B and C have the same 
direction, it is evident by inspection that Eq. (24) still holds. 
Hence, the restrictions we imposed upon A, B, and C at the 
beginning of our discussion can be eliminated, and Eq. (24) is 
correct in all cases. 

By a straightforward application of Eq. (24) we find that 
(A × B) × C = -C × (A × B) = -(C · B)A + (C · A)B

which is not equal to A x (B x C). Hence, the position of the 
parentheses in a vector triple product is significant. 
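Both the expansion (24) and the significance of the parentheses can be checked on a numerical example. The sketch below uses integer vectors so that the comparison is exact; the vectors and helper names are illustrative assumptions.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def combine(s, u, t, v):
    """Linear combination s*u + t*v of two vectors."""
    return tuple(s * p + t * q for p, q in zip(u, v))

A = (2, -2, 1)
B = (1, 8, -4)
C = (12, -4, -3)

left = cross(A, cross(B, C))                      # A x (B x C)
right = combine(dot(A, C), B, -dot(A, B), C)      # (A . C)B - (A . B)C
other_grouping = cross(cross(A, B), C)            # (A x B) x C
```

For these vectors the two sides of (24) coincide, while the other grouping of the parentheses gives a different vector.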

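† Unless, of course, (A · C)(B · B) - (A · B)(C · B) = 0, in which case the
value of ν is irrelevant.

With a knowledge of scalar and vector triple products,
products involving more than three vectors can be expanded without
difficulty. For instance,

$$(\mathbf{A} \times \mathbf{B}) \cdot (\mathbf{C} \times \mathbf{D})$$

can be regarded as the scalar triple product of the vectors A, B,
and (C × D). This allows us to write

$$\begin{aligned}
\mathbf{A} \times \mathbf{B} \cdot (\mathbf{C} \times \mathbf{D}) &= \mathbf{A} \cdot [\mathbf{B} \times (\mathbf{C} \times \mathbf{D})] \\
&= \mathbf{A} \cdot [(\mathbf{B} \cdot \mathbf{D})\mathbf{C} - (\mathbf{B} \cdot \mathbf{C})\mathbf{D}] \\
&= (\mathbf{A} \cdot \mathbf{C})(\mathbf{B} \cdot \mathbf{D}) - (\mathbf{A} \cdot \mathbf{D})(\mathbf{B} \cdot \mathbf{C})
\end{aligned}$$

This result is sometimes referred to as Lagrange's identity.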

Similarly, (A x B) X (C X D) can be thought of as the vector 
triple product of (A X B), C, and D or of A, B, and (C x D). 
Taking the former point of view and applying (24), we find 
(A X B) X (C X D) = [A X B • D]C - [A X B • C]D 
= [ABD]C - [ABC]D 

which is a vector in the plane of C and D. From the latter point of 
view, 

(A X B) x (C x D) = - (C x D) x (A x B) 

= -[C X D • B]A + [C x D • A]B 
= [CDA]B - [CDB]A 

which is a vector in the plane of A and B. These two results 
together show that (A x B) x (C x D) is directed along the line 
of intersection of the plane of A and B and the plane of C and D. 
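Lagrange's identity and the two expansions of (A × B) × (C × D) can be verified together. The sketch below works with integer vectors, so every comparison is exact; the four vectors and the helper names are assumptions made for illustration.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def triple(a, b, c):
    """Scalar triple product [abc]."""
    return dot(a, cross(b, c))

def combine(s, u, t, v):
    """Linear combination s*u + t*v of two vectors."""
    return tuple(s * p + t * q for p, q in zip(u, v))

A, B, C, D = (2, -2, 1), (1, 8, -4), (12, -4, -3), (1, 1, 1)

lagrange_left = dot(cross(A, B), cross(C, D))
lagrange_right = dot(A, C) * dot(B, D) - dot(A, D) * dot(B, C)

quad = cross(cross(A, B), cross(C, D))
in_plane_CD = combine(triple(A, B, D), C, -triple(A, B, C), D)   # [ABD]C - [ABC]D
in_plane_AB = combine(triple(C, D, A), B, -triple(C, D, B), A)   # [CDA]B - [CDB]A
```

Since the same vector is obtained as a combination of C and D and as a combination of A and B, it must lie along the line in which the two planes intersect.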

EXERCISES 

1 For each of the following sets of vectors: 

a  A = 2i - 2j + k        b  A = 2i - 3j + 6k        c  A = 10i + 10j + 5k
   B = i + 8j - 4k           B = 10i + 2j + 11k         B = 5i - 2j - 14k
   C = 12i - 4j - 3k         C = 2i - 9j - 6k           C = 4i + 7j - 4k

what are the lengths of A, B, and C? What is A · B? A × C? the projection of B on C? the
projection of C on B? the angle between A and B? [ABC]? A × (B × C)? the volume of the
parallelepiped having A + C, A - C, and B as concurrent edges? the volume of the
parallelepiped having A + C, A - C, and C as concurrent edges?

2 If A, B, and C are any three vectors, prove that 

A X (B X C) + B X (C X A) + C X (A X B) = 0 

3 Prove that (A × B) · (C × D) + (B × C) · (A × D) + (C × A) · (B × D) = 0.

4 If the plane determined by A and B is perpendicular to the plane determined by C and D, 
show that (A X B) • (C X D) = 0. 

5 Show that the volume of the tetrahedron having A + B, B + C, and C + A as concurrent
edges is twice the volume of the tetrahedron having A, B, and C as concurrent edges. 

6 If A is a given vector and X • A = Y • A, can we conclude that X = Y? 

7 Are two vectors equal if they have equal components in a given direction? in two given 
directions? in three given directions? in an arbitrary direction? 

8 Find the unit vector perpendicular to both i — 2j + k and 3i + j — 2k. 

9 Find the unit vector parallel to the plane of i + j - 2k and 3i - 2j + k and perpendicular
to 2i + 2j - k. 


544 


VECTOR ANALYSIS 


CHAP. 12 


10 Show that, if the four vectors A, B, C, D are coplanar, then (A X B) X (C X D) = 0. Is
the converse true?

11 Show that, if A + B + C = 0, then A X B = B X C = C X A. Is the converse true?

12 Prove that two nonzero vectors are linearly dependent if and only if they are parallel.

13 Prove that three vectors are linearly dependent if and only if they are parallel to the same
plane.

14 Prove that four vectors are always linearly dependent. [Hint: Expand (A X B) X (C X D)
in two different ways and equate the results.]

15 Prove that, for all values of the a's and b's,

$(a_1 b_1 + a_2 b_2 + a_3 b_3)^2 \le (a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2)$

This is the special case n = 3 of what is known as Cauchy's inequality,

$\left(\sum_{i=1}^{n} a_i b_i\right)^2 \le \left(\sum_{i=1}^{n} a_i^2\right)\left(\sum_{i=1}^{n} b_i^2\right)$

16 By considering the dot product of the two vectors $\mathbf{A} = a_1\mathbf{i} + a_2\mathbf{j}$ and $\mathbf{B} = b_1\mathbf{i} + b_2\mathbf{j}$,
derive the formula for the cosine of the difference of two angles.

17 By considering the cross product of the two vectors of Exercise 16, derive the formula for
the sine of the difference of two angles.

18 Show that, if $\mathbf{A} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ is a constant vector drawn from the origin, the locus of
the end points of the vectors R = xi + yj + zk which satisfy the equation (R - A) • A = 0
is a plane perpendicular to A at its end point. What is the locus of the end points of the
vectors which satisfy the equation (R - A) • R = 0? (R - A) • (R - A) = 0?

19 The three noncollinear points L, M, and N lie in the plane p. Prove that, if L, M, and N are
the vectors to these points from an origin not lying in p, the vector

(L X M) + (M X N) + (N X L)

is perpendicular to p.

20 Carry through in detail the geometrical proof that dot multiplication is distributive over
addition.

21 Carry through in detail the geometrical proof that cross multiplication is distributive over
addition.

22 Prove that (A X B) • (B X C) X (C X A) = $[\mathbf{ABC}]^2$.

23 If A, B, and C are any three independent vectors, the vectors

$\mathbf{U} = \dfrac{\mathbf{B}\times\mathbf{C}}{[\mathbf{ABC}]} \qquad \mathbf{V} = \dfrac{\mathbf{C}\times\mathbf{A}}{[\mathbf{ABC}]} \qquad \mathbf{W} = \dfrac{\mathbf{A}\times\mathbf{B}}{[\mathbf{ABC}]}$

are said to form a set reciprocal to the set A, B, C. Show that

A • U = B • V = C • W = 1 and that [UVW] = 1/[ABC]

24 If A = i + 2j - 2k, B = i + 8j + 4k, and C = 12i - 4j + 3k, express the vector
i + 2j + 3k as a linear combination of A, B, and C and as a linear combination of the
vectors U, V, and W of the set reciprocal to A, B, and C.

25 Show that, if $\mathbf{A} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$, $\mathbf{B} = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$, $\mathbf{C} = c_1\mathbf{i} + c_2\mathbf{j} + c_3\mathbf{k}$, and
$\mathbf{D} = d_1\mathbf{i} + d_2\mathbf{j} + d_3\mathbf{k}$, then the system of equations

$a_1 x + b_1 y + c_1 z = d_1$
$a_2 x + b_2 y + c_2 z = d_2$
$a_3 x + b_3 y + c_3 z = d_3$

is equivalent to the single vector equation xA + yB + zC = D. Assuming that [ABC] ≠ 0,
solve this vector equation for x, y, and z, and show that the result is equivalent to that
obtained from the algebraic form of the system by using Cramer's rule (Theorem 7, Sec.
10.3).





26 In mechanics the moment M of a force F about a point O is defined as the magnitude of F
times the perpendicular distance from the point O to the line of action of F. If the vector
moment M is defined as the vector whose magnitude is M and whose direction is perpendicular
to the plane of O and F, show that M = R X F, where R is the vector from O to any
point on the line of action of F. Would M = F X R be equally acceptable? Explain.


12.2 

Vector functions of one variable 


If t is a scalar variable and if to each value of t in some range there
corresponds a value of a vector V, we say that V is a vector function
of t. Since the component of a vector in any direction is known
whenever the vector itself is known, it follows that, if V is a
function of t, so, too, are its components in the directions of the
unit vectors i, j, and k. Hence, we can write

(1)  $\mathbf{V}(t) = V_1(t)\,\mathbf{i} + V_2(t)\,\mathbf{j} + V_3(t)\,\mathbf{k}$

In particular, we say that V(t) is continuous if and only if the
three scalar functions $V_1(t)$, $V_2(t)$, and $V_3(t)$ are continuous.

If the independent variable t of a vector function V(t) changes
by an amount Δt, the function will in general change both in
magnitude and in direction. In other words, corresponding to the
scalar increment Δt we have the vector increment

(2)  $\Delta\mathbf{V} = \mathbf{V}(t + \Delta t) - \mathbf{V}(t)$
     $= [V_1(t + \Delta t)\,\mathbf{i} + V_2(t + \Delta t)\,\mathbf{j} + V_3(t + \Delta t)\,\mathbf{k}] - [V_1(t)\,\mathbf{i} + V_2(t)\,\mathbf{j} + V_3(t)\,\mathbf{k}]$
     $= \Delta V_1\,\mathbf{i} + \Delta V_2\,\mathbf{j} + \Delta V_3\,\mathbf{k}$

By the derivative of a vector function V(t), we mean, as usual,

$\dfrac{d\mathbf{V}}{dt} = \lim_{\Delta t \to 0} \dfrac{\mathbf{V}(t + \Delta t) - \mathbf{V}(t)}{\Delta t} = \lim_{\Delta t \to 0} \dfrac{\Delta\mathbf{V}}{\Delta t}$

or, using (2),

(3)  $\dfrac{d\mathbf{V}}{dt} = \lim_{\Delta t \to 0} \dfrac{\Delta V_1}{\Delta t}\,\mathbf{i} + \lim_{\Delta t \to 0} \dfrac{\Delta V_2}{\Delta t}\,\mathbf{j} + \lim_{\Delta t \to 0} \dfrac{\Delta V_3}{\Delta t}\,\mathbf{k} = \dfrac{dV_1}{dt}\,\mathbf{i} + \dfrac{dV_2}{dt}\,\mathbf{j} + \dfrac{dV_3}{dt}\,\mathbf{k}$

From (3) we are motivated to define the differential of a vector
function V(t) to be

(4)  $d\mathbf{V} = dV_1\,\mathbf{i} + dV_2\,\mathbf{j} + dV_3\,\mathbf{k}$

In particular, for the very important vector

(5)  $\mathbf{R} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$

drawn from the origin to the point (x,y,z), we have

(6)  $d\mathbf{R} = dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}$



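From the definition of the derivative of a vector function of
one variable it follows that sums, differences, and products of
vectors can be differentiated by formulas just like those of ordinary
calculus, provided that the proper order of factors is maintained
wherever the order is significant. Specifically, we have

(7)   $\dfrac{d(\mathbf{U} \pm \mathbf{V})}{dt} = \dfrac{d\mathbf{U}}{dt} \pm \dfrac{d\mathbf{V}}{dt}$

(8)   $\dfrac{d(\phi\mathbf{V})}{dt} = \dfrac{d\phi}{dt}\,\mathbf{V} + \phi\,\dfrac{d\mathbf{V}}{dt}$

(9)   $\dfrac{d(\mathbf{U}\cdot\mathbf{V})}{dt} = \dfrac{d\mathbf{U}}{dt}\cdot\mathbf{V} + \mathbf{U}\cdot\dfrac{d\mathbf{V}}{dt}$

(10)  $\dfrac{d(\mathbf{U}\times\mathbf{V})}{dt} = \dfrac{d\mathbf{U}}{dt}\times\mathbf{V} + \mathbf{U}\times\dfrac{d\mathbf{V}}{dt}$

(11)  $\dfrac{d[\mathbf{UVW}]}{dt} = \left[\dfrac{d\mathbf{U}}{dt}\,\mathbf{V}\,\mathbf{W}\right] + \left[\mathbf{U}\,\dfrac{d\mathbf{V}}{dt}\,\mathbf{W}\right] + \left[\mathbf{U}\,\mathbf{V}\,\dfrac{d\mathbf{W}}{dt}\right]$

(12)  $\dfrac{d[\mathbf{U}\times(\mathbf{V}\times\mathbf{W})]}{dt} = \dfrac{d\mathbf{U}}{dt}\times(\mathbf{V}\times\mathbf{W}) + \mathbf{U}\times\left(\dfrac{d\mathbf{V}}{dt}\times\mathbf{W}\right) + \mathbf{U}\times\left(\mathbf{V}\times\dfrac{d\mathbf{W}}{dt}\right)$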


The simplest example of a vector function of one variable is
the set of vectors drawn from the origin to the points of a curve
C on which the scalar variable t is a parameter. For a general
point on C is associated with a unique value of the parameter,
say $t = t_1$, and determines with the origin a unique vector $\mathbf{V}(t_1)$
(Fig. 12.10a). This correspondence between the values of t and
the vectors V(t) is clearly a vector function of t according to our
definition. Conversely, if the values of a continuous vector function
V(t) are drawn from a common origin, their end points will
define a curve C whose points will be in correspondence with the
values of the scalar variable t.

This point of view leads to an important geometric interpretation
of the derivative dV/dt. For, since Δt is just a scalar, the
quotient ΔV/Δt is a well-defined vector having the same direction
as ΔV itself. Moreover, as Fig. 12.10b shows, the direction of ΔV
is that of an infinitesimal chord of the curve C. Therefore, as Δt
approaches 0, the direction of ΔV, and, hence, the direction of
ΔV/Δt, approaches the direction of a tangent to C. That is, dV/dt
is a vector tangent to the curve C which is the locus of the end points of
the vectors V(t). In particular, if the scalar variable t is taken to be
the arc length s of C, measured from some reference point on C, we
have

$\left|\dfrac{d\mathbf{V}}{ds}\right| = \lim_{\Delta s \to 0}\left|\dfrac{\Delta\mathbf{V}}{\Delta s}\right| = \lim_{\Delta s \to 0}\dfrac{\text{infinitesimal chord of } C}{\text{infinitesimal arc of } C} = 1$

Hence, if s is the arc length of the curve C defined by the end points of
the vectors V(s), then dV/ds is a unit vector tangent to C.

EXAMPLE 1 

At what point or points is the tangent to the curve $x = t^3$, $y = 5t^2$, $z = 10t$ perpendicular to the
tangent at the point where t = 1?

From our earlier discussion it is clear that the given curve is equivalent to the vector
function

$\mathbf{V}(t) = t^3\,\mathbf{i} + 5t^2\,\mathbf{j} + 10t\,\mathbf{k}$

Moreover, the tangent to this curve at a general point t is

$\dfrac{d\mathbf{V}}{dt} = 3t^2\,\mathbf{i} + 10t\,\mathbf{j} + 10\,\mathbf{k}$

and, in particular, at t = 1 the tangent is

$3\mathbf{i} + 10\mathbf{j} + 10\mathbf{k}$

Using the fact that two nonzero vectors are perpendicular if and only if their dot product
vanishes, it follows that the tangent at a general point t will be perpendicular to the tangent at the
point t = 1 if and only if

$3(3t^2) + 10(10t) + 10(10) = 9t^2 + 100t + 100 = 0$

This condition holds for the two values t = -10/9, -10. Hence, evaluating the x, y, and z
coordinates of the points with these parameters, it follows that the tangent at

$\left(-\dfrac{1{,}000}{729},\ \dfrac{500}{81},\ -\dfrac{100}{9}\right)$

and the tangent at (-1,000, 500, -100) are both perpendicular to the tangent at t = 1 and that
these are the only points with this property.
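
The arithmetic of Example 1 can be spot-checked numerically. The following is only a minimal sketch in Python, not part of the original text; the helper names `tangent`, `point`, `dot`, and `perpendicular_parameters` are hypothetical and chosen for illustration. It solves the quadratic condition for t and confirms that the dot product of the tangents vanishes at each root.

```python
import math


def tangent(t):
    """dV/dt for V(t) = t^3 i + 5 t^2 j + 10 t k."""
    return (3 * t**2, 10 * t, 10.0)


def point(t):
    """Coordinates (x, y, z) of the curve at parameter t."""
    return (t**3, 5 * t**2, 10 * t)


def dot(a, b):
    """Dot product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))


def perpendicular_parameters():
    """Roots of 9 t^2 + 100 t + 100 = 0, i.e. tangent(t) . tangent(1) = 0."""
    a, b, c = 9.0, 100.0, 100.0
    root = math.sqrt(b * b - 4 * a * c)
    return sorted(((-b - root) / (2 * a), (-b + root) / (2 * a)))


if __name__ == "__main__":
    for t in perpendicular_parameters():
        # Print the parameter, the point on the curve, and the (near-zero) dot product.
        print(t, point(t), dot(tangent(t), tangent(1.0)))
```

The two parameters returned correspond to the two points found in the example; the printed dot products should be zero up to rounding.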

EXAMPLE 2 

Discuss from the point of view of vector analysis the problem of the determination of the velocity 
and acceleration of a particle moving along a curve C. 

To do this, let us suppose that the path C, which is the locus of the instantaneous positions
of the moving particle, is defined by the vector function P(t), where t is the time. In other words,
P(t) is the vector drawn from the origin to the position of the moving particle at the general time t.
Now let s be the arc length of C. Then, by the chain rule, we can write

(13)  $\dfrac{d\mathbf{P}}{dt} = \dfrac{d\mathbf{P}}{ds}\,\dfrac{ds}{dt}$

Since ds/dt is the speed v of the moving particle and since dP/ds is a unit vector tangent to the
path of the particle, it follows from (13) that the vector

(14)  $\mathbf{v} = \dfrac{d\mathbf{P}}{dt} = v\,\dfrac{d\mathbf{P}}{ds}$

agrees both in magnitude and in direction with the velocity of the particle and, thus, can properly
be called its vector velocity.

Moreover, if we define the vector acceleration of the particle as the time derivative of its
vector velocity and, for convenience, denote the general unit vector tangent to C, namely, dP/ds,
by the symbol T, so that (14) becomes

$\mathbf{v} = v\,\mathbf{T}$

we can write

(15)  $\dfrac{d\mathbf{v}}{dt} = \dfrac{d(v\mathbf{T})}{dt} = \dfrac{dv}{dt}\,\mathbf{T} + v\,\dfrac{d\mathbf{T}}{dt} = \dfrac{dv}{dt}\,\mathbf{T} + v\,\dfrac{d\mathbf{T}}{ds}\,\dfrac{ds}{dt} = \dfrac{dv}{dt}\,\mathbf{T} + v^2\,\dfrac{d\mathbf{T}}{ds}$


In the first term on the right in (15) the scalar quantity dv/dt is the rate of change of the 
tangential speed v. Therefore, since T is by definition a unit vector tangent to C, the product 
(dv/dt) T is in magnitude and direction just the tangential acceleration of the moving particle. 

To interpret the second term on the right in (15), we observe that, since T is a unit vector, it
can vary only in direction. Hence, if the various values of T are drawn from a common origin,
the locus of their end points will be a curve on a sphere of unit radius. Now the length of the
increment ΔT (Fig. 12.11b) is approximately the length of the arc A'B', which in turn is equal to
Δθ, where Δθ is the angle between the tangents to C at the points A and B, a distance Δs apart
(Fig. 12.11a). Hence,

$\left|\dfrac{d\mathbf{T}}{ds}\right| = \lim_{\Delta s \to 0}\dfrac{|\Delta\mathbf{T}|}{|\Delta s|} = \lim_{\Delta s \to 0}\dfrac{\text{angle between tangents to } C \text{ at } A \text{ and } B}{\text{arc length along } C \text{ between } A \text{ and } B} = \text{curvature of } C$

Moreover, from Fig. 12.11b it is evident that the limiting direction of ΔT is perpendicular to T
in the plane which T and T + ΔT determine in the limit. If C is a plane curve, this, of course, is
the unique plane in which C lies. If C is a twisted curve, this plane, which is known as the
osculating plane, will vary from point to point along C. Hence, to summarize, dT/ds is a vector
perpendicular to C in the osculating plane of C whose magnitude is equal to the curvature K of C.

[Figure 12.11: The unit tangents to a space curve plotted from a common origin. Panels (a) and (b).]

If, finally, we let N denote a unit normal drawn toward the concave side of C in the osculating
plane and define the radius of curvature of C as

$\rho = \dfrac{1}{K}$

we can write Eq. (15) in the form

(16)  $\dfrac{d\mathbf{v}}{dt} = \dfrac{dv}{dt}\,\mathbf{T} + \dfrac{v^2}{\rho}\,\mathbf{N}$

which shows that at any point in its path, the vector acceleration of a moving particle is the sum of a
component of magnitude dv/dt along the tangent to the path and a component of magnitude $v^2/\rho$
normal to the path in the osculating plane to the path.
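
The decomposition in Eq. (16) can be illustrated numerically. The sketch below is an added illustration in Python, not part of the original text; the function name `acceleration_components` is hypothetical. It assumes the vector velocity and vector acceleration at one instant are already known, takes the tangential component as the projection of the acceleration on the velocity direction, and recovers the normal component and the radius of curvature from the remainder.

```python
import math


def dot(a, b):
    """Dot product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))


def acceleration_components(velocity, acceleration):
    """Return (a_T, a_N, rho) at one instant.

    a_T = dv/dt is the projection of the acceleration on the unit tangent T,
    a_N = v^2 / rho is the magnitude of what remains, and rho is the radius
    of curvature (infinite when the normal component vanishes).
    """
    speed = math.sqrt(dot(velocity, velocity))
    a_t = dot(velocity, acceleration) / speed
    a_n_squared = dot(acceleration, acceleration) - a_t**2
    a_n = math.sqrt(max(a_n_squared, 0.0))  # guard against tiny negative rounding
    rho = math.inf if a_n == 0.0 else speed**2 / a_n
    return a_t, a_n, rho


if __name__ == "__main__":
    # Assumed sample path P(t) = t i + t^2 j, evaluated at t = 1:
    # velocity = i + 2j, acceleration = 2j.
    print(acceleration_components((1.0, 2.0, 0.0), (0.0, 2.0, 0.0)))
```

For the assumed parabola at t = 1 the returned radius of curvature agrees with the familiar plane-curve formula $(1 + y'^2)^{3/2}/|y''|$.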

EXERCISES 

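1 If P = A cos kt + B sin kt, where A and B are arbitrary constant vectors, show that
P X dP/dt is a constant and that $d^2\mathbf{P}/dt^2 + k^2\mathbf{P} = 0$.

2 If P is any vector, show that $\dfrac{d}{dt}\left(\mathbf{P}\times\dfrac{d\mathbf{P}}{dt}\right) = \mathbf{P}\times\dfrac{d^2\mathbf{P}}{dt^2}$.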


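3 What is the derivative (a) of $\mathbf{U}\cdot\dfrac{d\mathbf{U}}{dt}\times\dfrac{d^2\mathbf{U}}{dt^2}$? (b) of $\mathbf{U}\times\left(\dfrac{d\mathbf{U}}{dt}\times\dfrac{d^2\mathbf{U}}{dt^2}\right)$?

4 If V is an arbitrary vector function of t, is |dV| = d|V|?

5 If V is an arbitrary function of t, show that $\mathbf{V}\cdot\dfrac{d\mathbf{V}}{dt} = |\mathbf{V}|\,\dfrac{d|\mathbf{V}|}{dt}$.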


6 What is the angle between the tangents to the curve $x = t$, $y = t^2$, $z = t^3$ at the points
where t = 1 and t = -1?

7 Show that there are no pairs of points on the curve $x = t$, $y = t^2$, $z = t^3$ at which the
tangents are parallel. Are there such pairs of points on the curve $x = 3t^4 - 6t^2 + 12t$, $y = 4t^3 - 6t^2$,
$z = 12t$? on the curve $x = 15t$, $y = 5t^3$, $z = 15t + 3t^5$?

8 If $\mathbf{R} = t^2\,\mathbf{i} - t^3\,\mathbf{j} + t^4\,\mathbf{k}$ is the vector from the origin to a moving particle, find the resultant
velocity of the particle when t = 1. What is the component of this velocity in the direction
of the vector 8i - j + 4k? What is the vector acceleration of the particle? What are the
tangential and normal components of its acceleration?

9 If a particle starts to move from rest at the point (0,1,2) with component accelerations
$a_x = 1 + 2t$, $a_y = t^3$, $a_z = 2t - t^2$, find the vector from the origin to the instantaneous
position of the particle.

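10 If $\mathbf{R}_1, \mathbf{R}_2, \ldots, \mathbf{R}_n$ are the vectors from the origin to the respective mass particles $m_1,
m_2, \ldots, m_n$, the end point of the vector

$\mathbf{C} = \dfrac{\sum_{i=1}^{n} m_i\mathbf{R}_i}{\sum_{i=1}^{n} m_i}$

is called the center of gravity of the system of particles. Show that, for any vector R,

$\sum_{i=1}^{n} m_i(\mathbf{R} - \mathbf{R}_i)\cdot(\mathbf{R} - \mathbf{R}_i) = m(\mathbf{R} - \mathbf{C})\cdot(\mathbf{R} - \mathbf{C}) + \sum_{i=1}^{n} m_i(\mathbf{C} - \mathbf{R}_i)\cdot(\mathbf{C} - \mathbf{R}_i)$

where m is the total mass of all the particles.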

11 If a particle moves under the influence of a force F which is always directed toward the
origin, show that $\mathbf{R}\times d^2\mathbf{R}/dt^2 = 0$, where R is the vector from the origin to the particle.
(Hint: Newton's law, i.e., mass X acceleration = force, remains correct when the acceleration
and the force are interpreted as vector quantities.)

12 If R(t) is the vector from the origin to the instantaneous position of a particle moving along
a curve C, show that R X dR is equal to twice the area of the sector defined by the two
vectors R(t) and R(t + dt) = R + dR and the arc of C which they intercept. Hence show
that (a) if R X dR/dt = 0, the vector R has a constant direction, and that (b) if
$\mathbf{R}\times d^2\mathbf{R}/dt^2 = 0$, the particle moves so that the radius vector R sweeps out equal areas in equal
times. [Property b is a generalization of one of the laws of planetary motion discovered by
Johannes Kepler (1571-1630).]

13 If T is a unit vector tangent to C and if N is the unit normal to C in the osculating plane, the
vector B = T X N is called the binormal to C at the point where T and N are drawn. Using
the fact that dT/ds = N/ρ, show that dB/ds = T X dN/ds and hence that dB/ds has the
same direction as N. (The absolute value of dB/ds is called the torsion of the curve C and
measures the rate at which the osculating plane turns as we move along C.)

14 What is the equation of the osculating plane to the space curve % ~ t\ y ~ t\ z = i 3 at the 
point $P_1\!:(x_1,y_1,z_1)$ whose parameter is $t = t_1$? [Hint: Let P:(x,y,z) be a general point in the
osculating plane, and impose the condition that the vector joining P to $P_1$ be coplanar with
the vectors T and dT/dt at $P_1$.]

15 What is the osculating plane to the curve $x = t$, $y = t^2$, $z = t^3$ at the point whose parameter
is t? What is the normal to this curve at the point t = 1? What is the binormal at t = 1?


12.3 

The operator ∇

Let φ(x,y,z) be a scalar function of position possessing first partial
derivatives with respect to x, y, and z throughout some region of
space, and let R = xi + yj + zk be the vector drawn from the
origin to a general point P:(x,y,z). If we move from P to a
neighboring point Q:(x + Δx, y + Δy, z + Δz) (Fig. 12.12), the
function φ will change by an amount Δφ whose exact value, as
derived in calculus, is

(1)  $\Delta\phi = \dfrac{\partial\phi}{\partial x}\Delta x + \dfrac{\partial\phi}{\partial y}\Delta y + \dfrac{\partial\phi}{\partial z}\Delta z + \epsilon_1\Delta x + \epsilon_2\Delta y + \epsilon_3\Delta z$

where $\epsilon_1$, $\epsilon_2$, $\epsilon_3$ are quantities which approach zero as Q approaches
P, i.e., as Δx, Δy, and Δz approach zero. If we divide the change
Δφ by the distance Δs ≡ |ΔR| between P and Q, we obtain a
measure of the rate at which φ changes when we move from P
to Q:

(2)  $\dfrac{\Delta\phi}{\Delta s} = \dfrac{\partial\phi}{\partial x}\dfrac{\Delta x}{\Delta s} + \dfrac{\partial\phi}{\partial y}\dfrac{\Delta y}{\Delta s} + \dfrac{\partial\phi}{\partial z}\dfrac{\Delta z}{\Delta s} + \epsilon_1\dfrac{\Delta x}{\Delta s} + \epsilon_2\dfrac{\Delta y}{\Delta s} + \epsilon_3\dfrac{\Delta z}{\Delta s}$

For instance, if φ(x,y,z) is the temperature at the general point
P:(x,y,z), then Δφ/Δs is the average rate of change of temperature
in degrees per unit length at the point P in the direction in which
Δs is measured. The limiting value of Δφ/Δs as Q approaches P
along the segment PQ is called the derivative of φ in the direction
PQ or simply the directional derivative of φ. Clearly, in the limit
the last three terms in (2) become zero and we have explicitly

(3)  $\dfrac{d\phi}{ds} = \dfrac{\partial\phi}{\partial x}\dfrac{dx}{ds} + \dfrac{\partial\phi}{\partial y}\dfrac{dy}{ds} + \dfrac{\partial\phi}{\partial z}\dfrac{dz}{ds}$

[Figure 12.12: The coordinate vectors of two neighboring points.]

The first factor in each product on the right in (3) depends
only on φ and the coordinates of the point P at which the derivatives
of φ are evaluated. The second factor in each product is
independent of φ and depends only on the direction in which the
derivative is being computed. This observation suggests that
dφ/ds can be thought of as the dot product of two vectors, one
depending only on φ and the coordinates of P, the other depending
only on the direction of ds; and in fact we can write

(4)  $\dfrac{d\phi}{ds} = \left(\dfrac{\partial\phi}{\partial x}\mathbf{i} + \dfrac{\partial\phi}{\partial y}\mathbf{j} + \dfrac{\partial\phi}{\partial z}\mathbf{k}\right)\cdot\left(\dfrac{dx}{ds}\mathbf{i} + \dfrac{dy}{ds}\mathbf{j} + \dfrac{dz}{ds}\mathbf{k}\right) = \left(\dfrac{\partial\phi}{\partial x}\mathbf{i} + \dfrac{\partial\phi}{\partial y}\mathbf{j} + \dfrac{\partial\phi}{\partial z}\mathbf{k}\right)\cdot\dfrac{d\mathbf{R}}{ds}$

The vector function

$\dfrac{\partial\phi}{\partial x}\mathbf{i} + \dfrac{\partial\phi}{\partial y}\mathbf{j} + \dfrac{\partial\phi}{\partial z}\mathbf{k}$

is known as the gradient of φ or simply grad φ, and in this notation
(4) can be rewritten in the form

(4a)  $\dfrac{d\phi}{ds} = (\operatorname{grad}\phi)\cdot\dfrac{d\mathbf{R}}{ds}$

To determine the significance of grad φ, we observe first that,
since Δs is by definition just the length of ΔR, it follows that dR/ds
is a unit vector. Hence the dot product (grad φ) • (dR/ds) is just
the projection of grad φ in the direction of dR/ds. Thus, according
to (4a), grad φ has the property that its projection in any direction
is equal to the derivative of φ in that direction (Fig. 12.13a). Since
the maximum projection of a vector is the vector itself, it is clear
that grad φ extends in the direction of the greatest rate of change of φ
and has that rate of change for its length.

If we set φ(x,y,z) = c, we obtain, as c takes on different
values, a family of surfaces known as the level surfaces* of φ,
and, on the assumption that φ is a single-valued function, one and
only one level surface passes through any given point P. If we now
consider the level surface of φ which passes through P and fix our
attention on neighboring points Q which lie on this surface, we
have dφ/ds = 0, since Δφ = 0 because, by definition, φ has the same
value at all points of a level surface. Hence, by (4a),

(5)  $(\operatorname{grad}\phi)\cdot\dfrac{d\mathbf{R}}{ds} = 0$

for any vector dR/ds which has the limiting direction of a secant
PQ of the level surface. Clearly, such vectors are all tangent to
φ = c at the point P; hence, from the vanishing of the dot
product in (5) it follows that grad φ is perpendicular to every
tangent to the level surface at P. In other words, the gradient of φ
at any point P is perpendicular to the level surface of φ which passes
through that point (Fig. 12.13b). Evidently, grad φ is related to the
level surfaces of φ in a way which is independent of the particular
coordinate system used to describe φ. In other words, grad φ
depends only on the intrinsic properties of φ. It follows, therefore,
that in the expression

$\operatorname{grad}\phi = \dfrac{\partial\phi}{\partial x}\mathbf{i} + \dfrac{\partial\phi}{\partial y}\mathbf{j} + \dfrac{\partial\phi}{\partial z}\mathbf{k}$

i, j, and k can be replaced by any other set of mutually perpendicular
unit vectors provided that ∂φ/∂x, ∂φ/∂y, ∂φ/∂z are replaced by the
directional derivatives of φ along the new axes.

[Figure 12.13, panel (b): grad φ and the level surface φ = c.]

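The gradient of a function is frequently written in operational
form as

$\operatorname{grad}\phi = \left(\mathbf{i}\dfrac{\partial}{\partial x} + \mathbf{j}\dfrac{\partial}{\partial y} + \mathbf{k}\dfrac{\partial}{\partial z}\right)\phi$

The operational "vector" thus defined is usually denoted by the
symbol ∇ (read "del"):

(6)  $\nabla = \mathbf{i}\dfrac{\partial}{\partial x} + \mathbf{j}\dfrac{\partial}{\partial y} + \mathbf{k}\dfrac{\partial}{\partial z}$

In this notation our earlier results can be written

(7)  $\operatorname{grad}\phi = \nabla\phi$

(8)  $\dfrac{d\phi}{ds} = \nabla\phi\cdot\dfrac{d\mathbf{R}}{ds}$

(9)  $d\phi = \nabla\phi\cdot d\mathbf{R}$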


* This name, which is used regardless of the number of independent variables,
is suggested by the analogy between the general case and the two-dimensional
topographic interpretation in which φ(x,y) is the elevation at the point (x,y)
and the loci φ(x,y) = c are the contour lines, i.e., curves consisting of points
where the elevation above (or below) the xy-plane is constant.



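Also, if φ is a function of u and u is a function of x, y, and z, then

(10)  $\nabla\phi = \dfrac{\partial\phi}{\partial x}\mathbf{i} + \dfrac{\partial\phi}{\partial y}\mathbf{j} + \dfrac{\partial\phi}{\partial z}\mathbf{k}$
      $= \dfrac{d\phi}{du}\dfrac{\partial u}{\partial x}\mathbf{i} + \dfrac{d\phi}{du}\dfrac{\partial u}{\partial y}\mathbf{j} + \dfrac{d\phi}{du}\dfrac{\partial u}{\partial z}\mathbf{k}$
      $= \dfrac{d\phi}{du}\left(\dfrac{\partial u}{\partial x}\mathbf{i} + \dfrac{\partial u}{\partial y}\mathbf{j} + \dfrac{\partial u}{\partial z}\mathbf{k}\right) = \dfrac{d\phi}{du}\nabla u$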


EXAMPLE 1 

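What is the directional derivative of the function $\phi(x,y,z) = xy^2 + yz^3$ at the point (2, -1, 1)
in the direction of the vector i + 2j + 2k?

Our first step must be to find the gradient of φ at the point (2, -1, 1). This is

$\nabla\phi = \left[\dfrac{\partial(xy^2 + yz^3)}{\partial x}\mathbf{i} + \dfrac{\partial(xy^2 + yz^3)}{\partial y}\mathbf{j} + \dfrac{\partial(xy^2 + yz^3)}{\partial z}\mathbf{k}\right]_{(2,-1,1)}$
$= \left[y^2\,\mathbf{i} + (2xy + z^3)\,\mathbf{j} + 3yz^2\,\mathbf{k}\right]_{(2,-1,1)}$
$= \mathbf{i} - 3\mathbf{j} - 3\mathbf{k}$

The projection of this in the direction of the given vector will be the required directional
derivative. Since this projection can be found at once as the dot product of ∇φ and a unit vector
in the given direction, we next reduce i + 2j + 2k to a unit vector by dividing it by its magnitude,
getting

$\dfrac{\mathbf{i} + 2\mathbf{j} + 2\mathbf{k}}{\sqrt{1 + 4 + 4}} = \tfrac{1}{3}\mathbf{i} + \tfrac{2}{3}\mathbf{j} + \tfrac{2}{3}\mathbf{k}$

The answer to our problem is, therefore,

$\nabla\phi\cdot\left(\tfrac{1}{3}\mathbf{i} + \tfrac{2}{3}\mathbf{j} + \tfrac{2}{3}\mathbf{k}\right) = (\mathbf{i} - 3\mathbf{j} - 3\mathbf{k})\cdot\left(\tfrac{1}{3}\mathbf{i} + \tfrac{2}{3}\mathbf{j} + \tfrac{2}{3}\mathbf{k}\right) = -\tfrac{11}{3}$

The negative sign, of course, indicates that φ decreases in the given direction.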
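
As a check on the computation in Example 1, the short Python sketch below (an added illustration, with hypothetical helper names `grad_phi` and `directional_derivative`) evaluates the gradient from the partial derivatives worked out above and projects it on the normalized direction vector.

```python
import math


def grad_phi(x, y, z):
    """Gradient of phi = x*y**2 + y*z**3, from its three partial derivatives."""
    return (y**2, 2 * x * y + z**3, 3 * y * z**2)


def directional_derivative(grad, direction):
    """Projection of grad on the unit vector along `direction`."""
    norm = math.sqrt(sum(c * c for c in direction))
    return sum(g * c for g, c in zip(grad, direction)) / norm


if __name__ == "__main__":
    g = grad_phi(2, -1, 1)
    print(g, directional_derivative(g, (1, 2, 2)))
```

Because the direction is normalized inside the function, any positive multiple of i + 2j + 2k gives the same result.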


EXAMPLE 2 

What is the unit normal to the surface $xy^3z^2 = 4$ at the point (-1, -1, 2)?

Let us regard the given surface as a particular level surface of the function $\phi = xy^3z^2$. Then
the gradient of this function at the point (-1, -1, 2) will be perpendicular to the level surface
through (-1, -1, 2), which is the given surface. When the gradient has been found, the unit
normal can be obtained at once by dividing the gradient by its magnitude:

$\nabla\phi = \left[y^3z^2\,\mathbf{i} + 3xy^2z^2\,\mathbf{j} + 2xy^3z\,\mathbf{k}\right]_{(-1,-1,2)} = -4\mathbf{i} - 12\mathbf{j} + 4\mathbf{k}$

$|\nabla\phi| = \sqrt{16 + 144 + 16} = 4\sqrt{11}$

$\dfrac{\nabla\phi}{|\nabla\phi|} = \dfrac{-4\mathbf{i} - 12\mathbf{j} + 4\mathbf{k}}{4\sqrt{11}} = -\dfrac{1}{\sqrt{11}}\mathbf{i} - \dfrac{3}{\sqrt{11}}\mathbf{j} + \dfrac{1}{\sqrt{11}}\mathbf{k}$

It may be necessary to reverse the direction of this result by multiplying it by -1, depending on
which side of the surface we wish the normal to extend.
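
The unit normal of Example 2 can also be approximated without differentiating by hand. The following Python sketch is an added illustration under stated assumptions: it swaps in a central-difference approximation of the gradient (not the analytic method used in the example), and the names `numerical_gradient` and `unit_normal` are hypothetical.

```python
import math


def phi(x, y, z):
    """The function whose level surface phi = 4 is the given surface."""
    return x * y**3 * z**2


def numerical_gradient(f, point, h=1e-6):
    """Central-difference approximation to grad f at `point`."""
    grad = []
    for axis in range(3):
        forward = list(point)
        backward = list(point)
        forward[axis] += h
        backward[axis] -= h
        grad.append((f(*forward) - f(*backward)) / (2 * h))
    return tuple(grad)


def unit_normal(f, point):
    """Gradient divided by its magnitude; its sign may need reversing."""
    g = numerical_gradient(f, point)
    length = math.sqrt(sum(c * c for c in g))
    return tuple(c / length for c in g)


if __name__ == "__main__":
    print(unit_normal(phi, (-1.0, -1.0, 2.0)))
```

The components should agree, to within the finite-difference error, with the values obtained analytically in the example.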

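The vector character of the operator ∇ suggests that we also
consider dot and cross products in which it appears as one factor.
If $\mathbf{F} = F_1\mathbf{i} + F_2\mathbf{j} + F_3\mathbf{k}$ is a vector whose components are functions
of x, y, and z, this leads to the combinations

(11)  $\nabla\cdot\mathbf{F} = \left(\mathbf{i}\dfrac{\partial}{\partial x} + \mathbf{j}\dfrac{\partial}{\partial y} + \mathbf{k}\dfrac{\partial}{\partial z}\right)\cdot(F_1\mathbf{i} + F_2\mathbf{j} + F_3\mathbf{k}) = \dfrac{\partial F_1}{\partial x} + \dfrac{\partial F_2}{\partial y} + \dfrac{\partial F_3}{\partial z}$

which is known as the divergence of the vector F, and

(12)  $\nabla\times\mathbf{F} = \left(\mathbf{i}\dfrac{\partial}{\partial x} + \mathbf{j}\dfrac{\partial}{\partial y} + \mathbf{k}\dfrac{\partial}{\partial z}\right)\times(F_1\mathbf{i} + F_2\mathbf{j} + F_3\mathbf{k})$
      $= \mathbf{i}\left(\dfrac{\partial F_3}{\partial y} - \dfrac{\partial F_2}{\partial z}\right) - \mathbf{j}\left(\dfrac{\partial F_3}{\partial x} - \dfrac{\partial F_1}{\partial z}\right) + \mathbf{k}\left(\dfrac{\partial F_2}{\partial x} - \dfrac{\partial F_1}{\partial y}\right)$
      $= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ F_1 & F_2 & F_3 \end{vmatrix}$

which is known as the curl of F.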

Both the divergence and the curl admit of physical interpretations
which justify their names. For instance, to illustrate the
significance of the divergence, consider a region of space filled
with a moving fluid, and let

$\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}$

be a vector function representing at each point the velocity with
which the particle of fluid instantaneously at that point is moving.
If we fix our attention on an infinitesimal volume (Fig. 12.14) in
the region occupied by the fluid, there will be flow through each
of its faces, and as a result the amount of fluid within the element
may vary. To measure this variation, let us compute the loss of
fluid from the element in the time Δt.

Now, the volume of fluid which passes through one face of
the element ΔV in time Δt is approximately equal to the component
of the fluid velocity normal to the face times the area of
the face times Δt, and the corresponding mass flow is, of course,
the product of this volume and the density of the fluid ρ. Hence,
computing the loss of fluid through each face in turn, remembering
that since the fluid is not assumed to be incompressible the density
as well as the velocity may vary from point to point, we have

Right face:    $\left[\rho v_2 + \dfrac{\partial(\rho v_2)}{\partial y}\Delta y\right]\Delta x\,\Delta z\,\Delta t$

Left face:     $-\rho v_2\,\Delta x\,\Delta z\,\Delta t$

Front face:    $\left[\rho v_1 + \dfrac{\partial(\rho v_1)}{\partial x}\Delta x\right]\Delta y\,\Delta z\,\Delta t$

Rear face:     $-\rho v_1\,\Delta y\,\Delta z\,\Delta t$

Top face:      $\left[\rho v_3 + \dfrac{\partial(\rho v_3)}{\partial z}\Delta z\right]\Delta x\,\Delta y\,\Delta t$

Bottom face:   $-\rho v_3\,\Delta x\,\Delta y\,\Delta t$

If we add these and convert the resulting estimate of the absolute
loss of fluid from ΔV in the interval Δt into the loss per unit
volume per unit time by dividing by ΔV Δt ≡ Δx Δy Δz Δt, we
obtain in the limit

Rate of loss per unit volume = $\dfrac{\partial(\rho v_1)}{\partial x} + \dfrac{\partial(\rho v_2)}{\partial y} + \dfrac{\partial(\rho v_3)}{\partial z}$

which is precisely the divergence of the vector ρv. Thus fluid
mechanics affords one possible interpretation of the divergence as
the rate of loss of fluid per unit volume.

[Figure 12.14: A typical volume element in a region filled with a moving fluid.]
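
The face-by-face bookkeeping above can be mimicked numerically. The Python sketch below is an added illustration, not part of the original text: the field `mass_flux` is a hypothetical stand-in for ρv chosen only so that its divergence is easy to write down, and `outflow_per_unit_volume` is a hypothetical helper that sums the net flow through the three pairs of opposite faces of a small cube (normal component sampled at each face center) and divides by the volume.

```python
def mass_flux(x, y, z):
    """Hypothetical rho*v field, used only for illustration."""
    return (x * x * y, y * z, x + z * z)


def exact_divergence(x, y, z):
    """Divergence of the field above: d/dx(x^2 y) + d/dy(y z) + d/dz(x + z^2)."""
    return 2 * x * y + z + 2 * z


def outflow_per_unit_volume(field, corner, size):
    """Net loss through the six faces of a cube, per unit volume.

    `corner` is the cube vertex with the smallest coordinates and `size`
    is the edge length.
    """
    x, y, z = corner
    h = size
    xc, yc, zc = x + h / 2, y + h / 2, z + h / 2  # face-center coordinates
    area = h * h
    net = 0.0
    net += (field(x + h, yc, zc)[0] - field(x, yc, zc)[0]) * area  # front minus rear
    net += (field(xc, y + h, zc)[1] - field(xc, y, zc)[1]) * area  # right minus left
    net += (field(xc, yc, z + h)[2] - field(xc, yc, z)[2]) * area  # top minus bottom
    return net / h**3


if __name__ == "__main__":
    h = 1e-3
    print(outflow_per_unit_volume(mass_flux, (1.0, 2.0, 3.0), h))
    print(exact_divergence(1.0 + h / 2, 2.0 + h / 2, 3.0 + h / 2))
```

As the cube shrinks, the outflow per unit volume approaches the divergence at the point, which is the limiting statement made in the text.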

If the fluid is incompressible, there can be neither gain nor
loss of fluid in a general element. Hence, since the density ρ is
constant for an incompressible fluid, we must have

$\nabla\cdot(\rho\mathbf{v}) = \rho\,\nabla\cdot\mathbf{v} = 0 \quad\text{or}\quad \nabla\cdot\mathbf{v} = 0$

which is known as the equation of continuity for incompressible
fluids. However, if ΔV encloses a source of fluid, then there is a
net loss of fluid through the surface of ΔV equal to the amount
diverging from the source. Similar results, of course, hold for such
things as electric and magnetic flux, which exhibit many of the
properties of incompressible fluids.

To find a possible interpretation of the curl, let us consider a
body rotating with uniform angular speed ω about an axis l. Let
us define the vector angular velocity Ω to be a vector of length ω,
extending along l in the direction in which a right-handed screw
would advance if subject to the same rotation as the body.
Finally, let R be the vector drawn from any point O on the axis l
to an arbitrary point P in the body.

From Fig. 12.15 it is evident that the radius at which P
rotates is |R| |sin θ|. Hence, the linear speed of P is

$|\mathbf{v}| = \omega\,|\mathbf{R}|\,|\sin\theta| = |\boldsymbol{\Omega}|\,|\mathbf{R}|\,|\sin\theta| = |\boldsymbol{\Omega}\times\mathbf{R}|$

Moreover, the vector velocity v is directed perpendicular to the
plane of Ω and R, so that Ω, R, and v form a right-handed system.
Hence, the cross product Ω X R gives not only the magnitude of
v but the direction as well.

Now, if we take the point O as the origin of coordinates, we
can write

$\mathbf{R} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k} \quad\text{and}\quad \boldsymbol{\Omega} = \Omega_1\mathbf{i} + \Omega_2\mathbf{j} + \Omega_3\mathbf{k}$

Hence, the equation v = Ω X R can be written at length in the
form

$\mathbf{v} = (\Omega_2 z - \Omega_3 y)\,\mathbf{i} - (\Omega_1 z - \Omega_3 x)\,\mathbf{j} + (\Omega_1 y - \Omega_2 x)\,\mathbf{k}$

If we take the curl of v, we therefore have

$\nabla\times\mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ \Omega_2 z - \Omega_3 y & -(\Omega_1 z - \Omega_3 x) & \Omega_1 y - \Omega_2 x \end{vmatrix}$

Expanding this, remembering that Ω is a constant vector, we find

$\nabla\times\mathbf{v} = 2\Omega_1\mathbf{i} + 2\Omega_2\mathbf{j} + 2\Omega_3\mathbf{k} = 2\boldsymbol{\Omega}$

or  $\boldsymbol{\Omega} = \tfrac{1}{2}\,\nabla\times\mathbf{v}$

The angular velocity of a uniformly rotating body is thus equal
to one-half the curl of the linear velocity of any point of the body.
The aptness of the name curl in this connection is apparent.
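
The result that the curl of v = Ω X R is 2Ω can be checked numerically. The Python sketch below is an added illustration: the angular velocity `OMEGA` is an arbitrary assumed value, the helper names are hypothetical, and the partial derivatives are approximated by central differences rather than expanded analytically as in the text.

```python
def cross(a, b):
    """Cross product of two 3-component sequences."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def partial(field, point, component, axis, h=1e-5):
    """Central-difference d(field[component]) / d(coordinate[axis]) at `point`."""
    forward = list(point)
    backward = list(point)
    forward[axis] += h
    backward[axis] -= h
    return (field(forward)[component] - field(backward)[component]) / (2 * h)


def curl(field, point):
    """Cartesian components of the curl of `field` at `point`."""
    return (partial(field, point, 2, 1) - partial(field, point, 1, 2),
            partial(field, point, 0, 2) - partial(field, point, 2, 0),
            partial(field, point, 1, 0) - partial(field, point, 0, 1))


OMEGA = (0.3, -1.2, 2.0)  # assumed constant angular velocity, for illustration only


def velocity(r):
    """Linear velocity v = OMEGA x R of the point with position vector r."""
    return cross(OMEGA, r)


if __name__ == "__main__":
    print(curl(velocity, (0.7, -0.4, 1.9)))
```

The printed components should be twice those of `OMEGA`, whatever point is chosen, since the curl of this velocity field is constant.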

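The results of applying the operator ∇ to various combinations
of scalar and vector functions can be found by the following
formulas:*

(13)  $\nabla\cdot(\phi\mathbf{v}) = \phi\,\nabla\cdot\mathbf{v} + \mathbf{v}\cdot\nabla\phi$

(14)  $\nabla\times(\phi\mathbf{v}) = \phi\,\nabla\times\mathbf{v} + (\nabla\phi)\times\mathbf{v}$

(15)  $\nabla\cdot(\mathbf{u}\times\mathbf{v}) = \mathbf{v}\cdot\nabla\times\mathbf{u} - \mathbf{u}\cdot\nabla\times\mathbf{v}$

(16)  $\nabla\times(\mathbf{u}\times\mathbf{v}) = \mathbf{v}\cdot\nabla\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{v} + \mathbf{u}\,\nabla\cdot\mathbf{v} - \mathbf{v}\,\nabla\cdot\mathbf{u}$

(17)  $\nabla(\mathbf{u}\cdot\mathbf{v}) = \mathbf{u}\cdot\nabla\mathbf{v} + \mathbf{v}\cdot\nabla\mathbf{u} + \mathbf{u}\times(\nabla\times\mathbf{v}) + \mathbf{v}\times(\nabla\times\mathbf{u})$

(18)  $\nabla\times\nabla\phi = 0$

(19)  $\nabla\cdot\nabla\times\mathbf{v} = 0$

(20)  $\nabla\times(\nabla\times\mathbf{v}) = \nabla(\nabla\cdot\mathbf{v}) - \nabla\cdot\nabla\mathbf{v} = \nabla(\nabla\cdot\mathbf{v}) - \nabla^2\mathbf{v}$

* We must remember, of course, that these results are correct only for the
cartesian form of the operator ∇ given by Eq. (6). Different formulas arise
when ∇ is expressed in terms of more general coordinate systems.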
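
Before any of these formulas is proved, one of them can be spot-checked numerically. The Python sketch below is an added illustration for Eq. (13) only: the scalar `phi` and the vector components in `V` are arbitrary assumed test functions, all helper names are hypothetical, and derivatives are approximated by central differences. Agreement at a sample point is a sanity check, not a proof.

```python
def d(f, point, axis, h=1e-4):
    """Central-difference partial derivative of the scalar function f."""
    forward = list(point)
    backward = list(point)
    forward[axis] += h
    backward[axis] -= h
    return (f(*forward) - f(*backward)) / (2 * h)


def gradient(f, point):
    """Approximate gradient of a scalar function at `point`."""
    return tuple(d(f, point, axis) for axis in range(3))


def divergence(components, point):
    """Approximate divergence of a vector field given as three scalar functions."""
    return sum(d(components[axis], point, axis) for axis in range(3))


def phi(x, y, z):
    """Assumed test scalar function."""
    return x * y + z**2


# Assumed test vector field v = (v1, v2, v3).
V = (lambda x, y, z: x * z,
     lambda x, y, z: y**2,
     lambda x, y, z: x + y * z)


def identity_13_sides(point):
    """Return (div(phi v), phi div v + v . grad phi) evaluated at `point`."""
    phi_v = tuple((lambda x, y, z, c=c: phi(x, y, z) * c(x, y, z)) for c in V)
    lhs = divergence(phi_v, point)
    v_here = tuple(c(*point) for c in V)
    grad_here = gradient(phi, point)
    rhs = phi(*point) * divergence(V, point) + sum(a * b for a, b in zip(v_here, grad_here))
    return lhs, rhs


if __name__ == "__main__":
    print(identity_13_sides((1.0, 2.0, -0.5)))
```

The two returned numbers should agree to within the finite-difference error.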




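These identities can all be verified by direct expansion. For
instance, to prove (13), we have

$\nabla\cdot(\phi\mathbf{v}) = \nabla\cdot[\phi(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k})]$
$= \dfrac{\partial(\phi v_1)}{\partial x} + \dfrac{\partial(\phi v_2)}{\partial y} + \dfrac{\partial(\phi v_3)}{\partial z}$
$= \phi\dfrac{\partial v_1}{\partial x} + v_1\dfrac{\partial\phi}{\partial x} + \phi\dfrac{\partial v_2}{\partial y} + v_2\dfrac{\partial\phi}{\partial y} + \phi\dfrac{\partial v_3}{\partial z} + v_3\dfrac{\partial\phi}{\partial z}$

which, on regrouping, is simply

$\phi\,\nabla\cdot\mathbf{v} + \mathbf{v}\cdot\nabla\phi$

as asserted.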

In general, however, it is easier to establish formulas like
those in the above list by treating ∇ as a vector, manipulating the
expressions according to the appropriate formulas from vector
algebra, and finally giving ∇ its operational meaning. Since ∇ is a
linear combination of scalar differential operators which obey the
usual product rule of differentiation (that is, act on the factors in
a product one at a time) it is clear that ∇ itself has this property.
In other words, we can apply ∇ to products of various sorts by
assuming that each of the factors in turn is the only one which is
variable and then adding the partial results so obtained. As a
notation to aid us in determining these partial results, it is helpful
to attach to ∇, whenever it is followed by more than one factor, a
subscript indicating the one factor upon which it is currently
allowed to operate.

To prove (14), using the second, more formal procedure, we
suppose first that the scalar function φ is a constant; that is, we
let ∇ operate only on the vector v. Then we can write

$\nabla_v\times(\phi\mathbf{v}) = \phi\,\nabla\times\mathbf{v}$

where the subscript v has been omitted from the right-hand side,
since it is always completely clear what ∇ operates on when it is
followed by just one factor. Similarly, if we regard v as constant
and φ as variable, we have

$\nabla_\phi\times(\phi\mathbf{v}) = (\nabla\phi)\times\mathbf{v}$

where the parentheses now restrict the effect of ∇ to the factor φ
alone and so make a subscript on ∇ unnecessary. Finally, adding
our two partial results, we have

$\nabla_v\times(\phi\mathbf{v}) + \nabla_\phi\times(\phi\mathbf{v}) = \nabla\times(\phi\mathbf{v}) = \phi\,\nabla\times\mathbf{v} + (\nabla\phi)\times\mathbf{v}$

To prove (15), we have, from the cyclic properties of scalar
triple products,

$\nabla_u\cdot(\mathbf{u}\times\mathbf{v}) = \mathbf{v}\cdot\nabla\times\mathbf{u} \quad\text{and}\quad \nabla_v\cdot(\mathbf{u}\times\mathbf{v}) = -\mathbf{u}\cdot\nabla\times\mathbf{v}$

Hence, adding these two partial results, we find

$\nabla\cdot(\mathbf{u}\times\mathbf{v}) = \mathbf{v}\cdot\nabla\times\mathbf{u} - \mathbf{u}\cdot\nabla\times\mathbf{v}$




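To prove (16), we have

$\nabla_u\times(\mathbf{u}\times\mathbf{v}) = (\nabla_u\cdot\mathbf{v})\mathbf{u} - (\nabla_u\cdot\mathbf{u})\mathbf{v} = \mathbf{v}\cdot\nabla\mathbf{u} - \mathbf{v}\,\nabla\cdot\mathbf{u}$

and

$\nabla_v\times(\mathbf{u}\times\mathbf{v}) = (\nabla_v\cdot\mathbf{v})\mathbf{u} - (\nabla_v\cdot\mathbf{u})\mathbf{v} = \mathbf{u}\,\nabla\cdot\mathbf{v} - \mathbf{u}\cdot\nabla\mathbf{v}$

Hence, adding,

$\nabla_u\times(\mathbf{u}\times\mathbf{v}) + \nabla_v\times(\mathbf{u}\times\mathbf{v}) = \nabla\times(\mathbf{u}\times\mathbf{v}) = \mathbf{v}\cdot\nabla\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{v} + \mathbf{u}\,\nabla\cdot\mathbf{v} - \mathbf{v}\,\nabla\cdot\mathbf{u}$

To prove (17) we note that

$\mathbf{u}\times(\nabla\times\mathbf{v}) \equiv \mathbf{u}\times(\nabla_v\times\mathbf{v}) = (\mathbf{u}\cdot\mathbf{v})\nabla_v - (\mathbf{u}\cdot\nabla)\mathbf{v} = \nabla_v(\mathbf{u}\cdot\mathbf{v}) - \mathbf{u}\cdot\nabla\mathbf{v}$

and

$\mathbf{v}\times(\nabla\times\mathbf{u}) \equiv \mathbf{v}\times(\nabla_u\times\mathbf{u}) = (\mathbf{v}\cdot\mathbf{u})\nabla_u - (\mathbf{v}\cdot\nabla)\mathbf{u} = \nabla_u(\mathbf{u}\cdot\mathbf{v}) - \mathbf{v}\cdot\nabla\mathbf{u}$

Hence, transposing and adding, we find

$\nabla_u(\mathbf{u}\cdot\mathbf{v}) + \nabla_v(\mathbf{u}\cdot\mathbf{v}) \equiv \nabla(\mathbf{u}\cdot\mathbf{v}) = \mathbf{u}\times(\nabla\times\mathbf{v}) + \mathbf{v}\times(\nabla\times\mathbf{u}) + \mathbf{u}\cdot\nabla\mathbf{v} + \mathbf{v}\cdot\nabla\mathbf{u}$

Without explicit expansion we infer that (18) is correct, since
the operational coefficient of φ, namely, ∇ X ∇, is, in effect, a cross
product of identical factors and hence zero. Similarly, without
expansion, we infer that (19) is correct, since ∇ • ∇ X v is a scalar
triple product containing two identical factors and hence is zero.

To establish (20), we merely apply the usual rule for expanding
a vector triple product:

$\nabla\times(\nabla\times\mathbf{v}) = (\nabla\cdot\mathbf{v})\nabla - (\nabla\cdot\nabla)\mathbf{v} = \nabla(\nabla\cdot\mathbf{v}) - \nabla^2\mathbf{v}$

where the conventional symbol $\nabla^2$ has been substituted for the
second-order operator

$\nabla\cdot\nabla = \dfrac{\partial^2}{\partial x^2} + \dfrac{\partial^2}{\partial y^2} + \dfrac{\partial^2}{\partial z^2}$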


EXERCISES 

In the following exercises R = xi + yj + zk, as usual, and $r = |\mathbf{R}| = \sqrt{x^2 + y^2 + z^2}$.

1 Prove that ∇ X R = 0. What is ∇ • R?

2 If A is an arbitrary constant vector, prove that ∇(A • R) = A. What is (A • ∇)R?

3 Compute (a) the divergence and (b) the curl of the vector

$xyz\,\mathbf{i} + 3x^2y\,\mathbf{j} + (xz^2 - y^2z)\,\mathbf{k}$

4 What is the directional derivative of the function $2xy + z^2$ at the point (1, -1, 3) in the
direction of the vector i + 2j + 2k?

5 What is the unit normal to the surface $z = x^2 + y^2$ at the point (1, -2, 5)?

6 What is the angle between the normals to the surface $xy = z^2$ at the points (1, 4, -2) and
(-3, -3, 3)?

7 Verify Eqs. (14) and (15) by direct expansion.

8 What is the generalization of Eq. (10) to the case in which φ is a function of u, v, and w,
where u, v, and w are each functions of x, y, and z?



9 Prove that the curl of any vector whose direction is constant is perpendicular to that
direction.

10 Prove that (A X ∇) X R = -2A. What is (A X ∇) • R?

11 Prove that ∇ • [(A X R)/r] = 0 for any constant vector A.

12 Prove that $\nabla\times\left(\dfrac{\mathbf{A}\times\mathbf{R}}{r}\right) = \dfrac{\mathbf{A}}{r} + \dfrac{\mathbf{A}\cdot\mathbf{R}}{r^3}\,\mathbf{R}$, for any constant vector A.

13 Prove that $\nabla r^n = n r^{n-2}\mathbf{R}$.

14 For what values of n is $\nabla^2 r^n = 0$?

15 Determine n so that $\nabla\cdot(r^n\mathbf{R})$ will vanish identically.

16 Prove that the curl of f(r)R is identically zero.

17 Prove that $\nabla\phi_1\times\nabla\phi_2 = \nabla\times(\phi_1\nabla\phi_2) = -\nabla\times(\phi_2\nabla\phi_1)$.

18 If $u = x + y + z$, $v = x + y$, and $w = -2xz - 2yz - z^2$, show that $[\nabla u\ \nabla v\ \nabla w] = 0$.

19 If three functions u, v, and w are connected by a relation f(u,v,w) = 0, prove that
$[\nabla u\ \nabla v\ \nabla w] = 0$. (Hint: Consider the dot product of ∇f and ∇u X ∇v.)

20 If $\mathbf{V}_1$ and $\mathbf{V}_2$ are the vectors which join the fixed points $P_1\!:(x_1,y_1,z_1)$ and $P_2\!:(x_2,y_2,z_2)$
to the variable point P:(x,y,z), prove that the gradient of $\mathbf{V}_1\cdot\mathbf{V}_2$ is $\mathbf{V}_1 + \mathbf{V}_2$. What is
$\nabla\cdot(\mathbf{V}_1\times\mathbf{V}_2)$? What is $\nabla\times(\mathbf{V}_1\times\mathbf{V}_2)$?


12.4 

Line, surface, and volume integrals 

In the rest of our work in vector analysis and in much of the work 
ahead of us in the chapters on complex variables, a simple exten- 
sion of the familiar process of integration known as line integration 
will be of fundamental importance. Although in vector analysis 
we are usually concerned with line integrals taken along space 
curves, it is convenient to begin our discussion with a considera- 
tion of line integration along plane curves, since the applications 
of line integration in our study of complex variables will be 
exclusively in two dimensions. In both the two-dimensional and 
the three-dimensional case, our work will involve only continuous 
curves which are sectionally smooth; that is, curves which are 
continuous and consist of a finite number of arcs on each of which 
the tangent changes continuously. Clearly, such curves can have 
at most a finite number of “corners” where the direction of the 
tangent changes abruptly. Moreover, as we learned in calculus, 
the length of such a curve between any two of its points is finite. 

Let F(x,y) be a function of x and y, and let C be a continuous,
sectionally smooth curve joining the points A and B.† Furthermore,
let the arc of C between A and B be divided into n segments
$\Delta s_i$ whose projections on the x- and y-axes are, respectively, $\Delta x_i$
and $\Delta y_i$, and let $(\xi_i,\eta_i)$ be the coordinates of an arbitrary point in
the segment $\Delta s_i$ (Fig. 12.16).

If we evaluate the given function F(x,y) at each of the points
$(\xi_i,\eta_i)$ and form the products
$$F(\xi_i,\eta_i)\,\Delta x_i \qquad F(\xi_i,\eta_i)\,\Delta y_i \qquad F(\xi_i,\eta_i)\,\Delta s_i$$
and then sum over all the subdivisions of the arc AB, we have
the three sums
$$\sum_{i=1}^{n} F(\xi_i,\eta_i)\,\Delta x_i \qquad \sum_{i=1}^{n} F(\xi_i,\eta_i)\,\Delta y_i \qquad \sum_{i=1}^{n} F(\xi_i,\eta_i)\,\Delta s_i$$
The limits of these sums, as n becomes infinite in such a way that
the length of each $\Delta s_i$ approaches zero, are known as line integrals
and are written, respectively,
$$\int_C F(x,y)\,dx \qquad \int_C F(x,y)\,dy \qquad \int_C F(x,y)\,ds$$
It can be shown* that the continuity of F(x,y) is a sufficient condition
for the existence of the limits which define these integrals.

FIGURE 12.16 The subdivision of an arc preparatory to the definition of a line integral.

† F(x,y) bears no relation to the equation of C and is merely a function
defined at every point of the portion of the curve C under consideration.

In these definitions, $\Delta x_i$ and $\Delta y_i$ are signed quantities, whereas
$\Delta s_i$ is intrinsically positive. Thus the following properties of
ordinary definite integrals:

a $\int_A^B c\,\phi(t)\,dt = c\int_A^B \phi(t)\,dt$, c a constant

b $\int_A^B [\phi_1(t) + \phi_2(t)]\,dt = \int_A^B \phi_1(t)\,dt + \int_A^B \phi_2(t)\,dt$

c $\int_A^B \phi(t)\,dt = -\int_B^A \phi(t)\,dt$

d $\int_A^P \phi(t)\,dt + \int_P^B \phi(t)\,dt = \int_A^B \phi(t)\,dt$

are equally valid for line integrals of the first two types, provided 
that throughout each formula the curve joining A and B remains 
the same. On the other hand, line integrals of the third type, 


* See, for instance, D. V. Widder, “Advanced Calculus,” p. 187, Prentice- 
Hall, Inc., Englewood Cliffs, N.J., 1947. 




although they do have properties a and b, do not have property 
c, since, in fact, 

$$\int_A^B F(x,y)\,ds = \int_B^A F(x,y)\,ds$$

Moreover, property d holds for these integrals if and only if P is 
between A and B on the path of integration. In general, we shall 
be much more interested in integrals of the first two types than in 
integrals of the third type. 
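
The contrast between the signed integrals (with respect to x and y) and the intrinsically positive integral with respect to s can be checked numerically. The following is only an illustrative sketch: the integrand F(x,y) = xy, the arc of the parabola y = x² from (0,0) to (1,1), and the helper name `line_sums` are assumptions chosen for the illustration and do not come from the text. The code forms the defining sums directly, replacing each subarc by its chord.

```python
import math

def line_sums(F, curve, t0, t1, n=20000):
    """Approximate the sums of F*dx, F*dy and F*ds along curve(t), t0 -> t1.

    Each subarc is replaced by its chord, and F is evaluated at an
    interior point of the subarc (the midpoint parameter value).
    """
    sx = sy = ss = 0.0
    h = (t1 - t0) / n          # negative when the curve is traversed backward
    x_prev, y_prev = curve(t0)
    for i in range(1, n + 1):
        x, y = curve(t0 + i * h)
        xm, ym = curve(t0 + (i - 0.5) * h)
        dx, dy = x - x_prev, y - y_prev
        f = F(xm, ym)
        sx += f * dx                   # signed increment
        sy += f * dy                   # signed increment
        ss += f * math.hypot(dx, dy)   # chord length, always positive
        x_prev, y_prev = x, y
    return sx, sy, ss

F = lambda x, y: x * y          # hypothetical integrand
parabola = lambda t: (t, t * t)  # hypothetical path y = x^2

forward = line_sums(F, parabola, 0.0, 1.0)   # from A = (0,0) to B = (1,1)
backward = line_sums(F, parabola, 1.0, 0.0)  # from B back to A
print(forward)
print(backward)
```

Under these assumptions the first two entries change sign when the direction of integration is reversed (property c), while the third entry is unchanged.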

Much of the initial strangeness of line integrals will disappear 
if we observe that the ordinary definite integrals of elementary 
calculus are just line integrals in which the curve C is the x-axis 
and the integrand is a function of x alone. Moreover, the evalua- 
tion of line integrals can be reduced to the evaluation of ordinary 
definite integrals, as the following example shows. 


EXAMPLE 1

What is the value of $\int_A^B \dfrac{1}{x + y}\,dx$ along each of the three paths shown in Fig. 12.17?

Before this integral can be evaluated, it is necessary that y be expressed in terms of x. To
do this, we recall from the definition of a line integral that the integrand is always to be evaluated
along the path of integration. Along $y = x^2$ this gives us the ordinary definite integral

$$\int_1^2 \frac{dx}{x + x^2} = \left[\ln\frac{x}{1 + x}\right]_1^2 = \ln\frac{4}{3}$$

Along AP the integral is obviously zero, since x remains constant. Along PB, on which y = 4,
we have the integral

$$\int_1^2 \frac{dx}{x + 4} = \big[\ln(x + 4)\big]_1^2 = \ln\frac{6}{5}$$

which is thus the value of the integral along the entire path APB. Along AQ, on which y = 1,
we have the integral

$$\int_1^2 \frac{dx}{x + 1} = \big[\ln(x + 1)\big]_1^2 = \ln\frac{3}{2}$$

Along QB the integral is again zero. Hence, along the entire path AQB the value of the integral
is $\ln\frac{3}{2}$.

This example not only illustrates the computational details of line integration, but also 
shows that in general a line integral depends not only on the end points of the integration but 
also upon the particular path which joins them. 
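
The path dependence found in Example 1 can be confirmed numerically. The sketch below assumes, as the limits of integration in the example indicate, that A = (1,1), B = (2,4), P = (1,4), and Q = (2,1); the helper name `integral_dx` is hypothetical. It approximates the integral of F dx along each path by summing F, evaluated at the midpoint of each subarc, times the signed increment of x.

```python
import math

def integral_dx(F, path, t0, t1, n=20000):
    """Approximate the line integral of F(x, y) dx along path(t), t0 -> t1."""
    total = 0.0
    h = (t1 - t0) / n
    x_prev, _ = path(t0)
    for i in range(1, n + 1):
        x, _ = path(t0 + i * h)
        xm, ym = path(t0 + (i - 0.5) * h)
        total += F(xm, ym) * (x - x_prev)   # dx is zero on vertical segments
        x_prev = x
    return total

F = lambda x, y: 1.0 / (x + y)

# Path 1: the parabola y = x^2 from A = (1,1) to B = (2,4).
parabola = integral_dx(F, lambda t: (t, t * t), 1.0, 2.0)

# Path 2: A -> P = (1,4) vertically, then P -> B horizontally.
via_P = (integral_dx(F, lambda t: (1.0, t), 1.0, 4.0)
         + integral_dx(F, lambda t: (t, 4.0), 1.0, 2.0))

# Path 3: A -> Q = (2,1) horizontally, then Q -> B vertically.
via_Q = (integral_dx(F, lambda t: (t, 1.0), 1.0, 2.0)
         + integral_dx(F, lambda t: (2.0, t), 1.0, 4.0))

print(parabola, math.log(4 / 3))
print(via_P, math.log(6 / 5))
print(via_Q, math.log(3 / 2))
```

Each printed pair compares the numerical sum with the closed-form value for that path; the three values differ, as the example asserts.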


It is possible, as in the case of ordinary integration, to
interpret a line integral as an area. For, if we think of the integrand
function F(x,y) as defining a surface over the xy-plane, then the
vertical cylindrical surface standing on the arc AB as base, or
directrix, will cut the surface z = F(x,y) in some curve such as
PQ in Fig. 12.18. This curve is clearly the upper boundary of the
portion ABQP of the cylindrical surface which lies above the
xy-plane, below the surface z = F(x,y), and between the generators
AP and BQ. Moreover, the product $F(\xi_i,\eta_i)\,\Delta s_i$ is approximately
the area of the vertical strip of this portion of the surface
which stands above the infinitesimal base $\Delta s_i$. Hence, the sum

$$\sum_{i=1}^{n} F(\xi_i,\eta_i)\,\Delta s_i$$

is approximately equal to the curved area ABQP, and, in the
limit, the integral

$$\int_C F(x,y)\,ds$$

gives this area exactly.

In a similar fashion, the product $F(\xi_i,\eta_i)\,\Delta x_i$ is approximately
the area of the projection on the xz-plane of the vertical strip
standing on $\Delta s_i$; the sum

$$\sum_{i=1}^{n} F(\xi_i,\eta_i)\,\Delta x_i$$

represents approximately the area of the projection on the xz-plane
of the entire curved area ABQP; and, in the limit, the
integral

$$\int_C F(x,y)\,dx$$

gives the projected area exactly. In the same way the integral

$$\int_C F(x,y)\,dy$$

represents the area of the projection of ABQP on the yz-plane.

Although this geometrical interpretation of line integrals as 
areas is vivid and easily grasped, it obscures the fact that almost 
invariably in applications the function F(x,y) describes some 
physical property of the plane of integration and is actually un- 
related to any other region of space. 

EXAMPLE 2

If a particle is attracted toward the origin by a force whose magnitude is proportional to the
distance r of the particle from the origin, how much work is done when the particle is moved
from the point (0,1) to the point (1,2) along the path $y = 1 + x^2$, assuming a coefficient of
friction $\mu$ between the particle and the path?

Let $\theta$ be the angle which the tangent to the curve at a general point P:(x,y) makes with the
x-axis; let $\phi$ be the angle which the radius vector to P makes with the x-axis; and let $\alpha$ be the
angle between the tangent and the radius vector at P (Fig. 12.19). In moving the particle an
infinitesimal distance $\Delta s$ along the path, work must be done against two forces, namely, the
tangential component of the central force
$$F_t = F\cos\alpha = kr\cos\alpha$$
and the frictional force
$$F_f = \mu F_n = \mu F\sin\alpha = \mu kr\sin\alpha$$
arising from the component of the central force which is normal to the path and which acts to
press the particle against the path. The infinitesimal amount of work done against these forces
in moving a distance $\Delta s$ is approximately
$$\Delta W = F_t\,\Delta s + F_f\,\Delta s = (kr\cos\alpha + \mu kr\sin\alpha)\,\Delta s$$
Now, from the exterior-angle theorem of plane geometry, $\alpha = \phi - \theta$. Hence,
$$W = k\int_{0,1}^{1,2} r\cos(\phi - \theta)\,ds + \mu k\int_{0,1}^{1,2} r\sin(\phi - \theta)\,ds$$
$$= k\int_{0,1}^{1,2} r(\cos\phi\cos\theta + \sin\phi\sin\theta)\,ds + \mu k\int_{0,1}^{1,2} r(\sin\phi\cos\theta - \cos\phi\sin\theta)\,ds$$
But $r\cos\phi = x$, $r\sin\phi = y$
and $\cos\theta\,ds = dx$, $\sin\theta\,ds = dy$

Hence, substituting these into the last expression for W, we have
$$W = k\int_{0,1}^{1,2} (x\,dx + y\,dy) + \mu k\int_{0,1}^{1,2} (y\,dx - x\,dy)$$
The first of these integrals can be written very simply as
$$\frac{k}{2}\int_{0,1}^{1,2} d(x^2 + y^2)$$
which, independent of the path, is just
$$\frac{k}{2}\Big[x^2 + y^2\Big]_{0,1}^{1,2} = 2k$$
The integrand of the second integral is not an exact differential, and thus, as usual, due account must be
taken of the path. Now, along the path, $y = x^2 + 1$ and $x = \sqrt{y - 1}$. Hence,
$$\mu k\int_{0,1}^{1,2} (y\,dx - x\,dy) = \mu k\left[\int_0^1 (x^2 + 1)\,dx - \int_1^2 \sqrt{y - 1}\,dy\right] = \mu k\left(\frac{4}{3} - \frac{2}{3}\right) = \frac{2\mu k}{3}$$
The total amount of work done in the course of the motion is therefore
$$W = 2k + \frac{2\mu k}{3}$$
The first term represents recoverable work stored as potential energy; the second term represents
irrecoverable work dissipated as heat through friction.
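
The two line integrals in Example 2 can be evaluated numerically by using x itself as the parameter on the path y = 1 + x². The sketch below is an illustration only: the function name `work_terms` and the sample values k = 1 and μ = 0.5 are assumptions, not part of the example. Carrying out the two integrations by hand gives 2k for the conservative term and 2μk/3 for the frictional term, which the sums should reproduce.

```python
def work_terms(k, mu, n=20000):
    """Return (conservative work, frictional work) along y = 1 + x^2, 0 <= x <= 1.

    Uses the midpoint rule with x as parameter, so dx = h and dy = 2x h.
    """
    h = 1.0 / n
    conservative = 0.0   # approximates the integral of x dx + y dy
    friction = 0.0       # approximates the integral of y dx - x dy
    for i in range(n):
        x = (i + 0.5) * h
        y = 1.0 + x * x
        dx = h
        dy = 2.0 * x * h
        conservative += x * dx + y * dy
        friction += y * dx - x * dy
    return k * conservative, mu * k * friction

# Sample values of k and mu chosen purely for illustration.
recoverable, dissipated = work_terms(k=1.0, mu=0.5)
print(recoverable, dissipated)
```

With these sample values the first number approximates 2k = 2 and the second approximates 2μk/3 = 1/3.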

The extension of line integration to paths in three dimensions
is easily accomplished. Let F(x,y,z) be a continuous function of
x, y, and z, and let C be a continuous, sectionally smooth curve
joining the points A and B. Furthermore, let the arc of C between
A and B be divided in an arbitrary manner into n subintervals
$\Delta s_i$ whose projections on the coordinate axes are $\Delta x_i$, $\Delta y_i$, and $\Delta z_i$,
and let an arbitrary point $P_i:(\xi_i,\eta_i,\zeta_i)$ be chosen in each $\Delta s_i$.
We now evaluate F(x,y,z) at each of the points $P_i$ and form the
sums

$$\sum_{i=1}^{n} F(\xi_i,\eta_i,\zeta_i)\,\Delta x_i \qquad \sum_{i=1}^{n} F(\xi_i,\eta_i,\zeta_i)\,\Delta y_i \qquad \sum_{i=1}^{n} F(\xi_i,\eta_i,\zeta_i)\,\Delta z_i \qquad \sum_{i=1}^{n} F(\xi_i,\eta_i,\zeta_i)\,\Delta s_i$$

The limits of these sums as n becomes infinite in such a way that
the length of each $\Delta s_i$ approaches zero define the respective line
integrals:

$$\int_C F(x,y,z)\,dx \qquad \int_C F(x,y,z)\,dy \qquad \int_C F(x,y,z)\,dz \qquad \int_C F(x,y,z)\,ds$$

Because of the difficulty of defining a space curve C as the 
intersection of several surfaces, it is customary to use a parametric 
representation for C. Hence, line integrals in three dimensions are 
ordinarily evaluated by integrating in terms of the parameter on 
C after the variables in the integrand have been replaced by their 
expressions in terms of the parameter. 

EXAMPLE 3

What is $\int_C (xy + z^2)\,ds$, where C is the arc of the helix
$$x = \cos t \qquad y = \sin t \qquad z = t$$
which joins the points (1,0,0) and $(-1,0,\pi)$?

Since $(ds)^2 = (dx)^2 + (dy)^2 + (dz)^2$

and since $dx = -\sin t\,dt$, $dy = \cos t\,dt$, and $dz = dt$, we have at once
$$ds = \sqrt{\sin^2 t + \cos^2 t + 1}\;|dt| = \sqrt{2}\,|dt|$$

Furthermore, it is clear that the point (1,0,0) corresponds to the parametric value t = 0 and
that the point $(-1,0,\pi)$ corresponds to the parametric value $t = \pi$. Hence, expressing the
integrand in terms of the parameter t, the required integral becomes

$$\int_0^{\pi} (\cos t\sin t + t^2)\sqrt{2}\,dt = \sqrt{2}\left[\frac{\sin^2 t}{2} + \frac{t^3}{3}\right]_0^{\pi} = \frac{\sqrt{2}\,\pi^3}{3}$$
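
The parametric evaluation in Example 3 is easy to check numerically. In the sketch below (the function name `helix_integral` is hypothetical), the integrand xy + z² is evaluated on the helix and multiplied by ds/dt, which is computed from the derivatives of the parametric equations rather than hard-coded as √2.

```python
import math

def helix_integral(n=20000):
    """Midpoint-rule approximation of the integral of (xy + z^2) ds
    along x = cos t, y = sin t, z = t for 0 <= t <= pi."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x, y, z = math.cos(t), math.sin(t), t
        # ds/dt = sqrt(x'(t)^2 + y'(t)^2 + z'(t)^2), which equals sqrt(2) here
        speed = math.sqrt(math.sin(t) ** 2 + math.cos(t) ** 2 + 1.0)
        total += (x * y + z * z) * speed * h
    return total

approx = helix_integral()
exact = math.sqrt(2.0) * math.pi ** 3 / 3.0
print(approx, exact)
```

The two printed numbers should agree to several decimal places.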

The concept of a line integral generalizes at once to surface
and volume integrals. To describe the former, let F(x,y,z) be a continuous
function of x, y, and z, and let S be a given regular* surface
or portion of a regular surface in the region of definition of F(x,y,z).
Let S be subdivided in an arbitrary manner into n elements $\Delta S_i$
(Fig. 12.20), and in each element let an arbitrary point $P_i:(\xi_i,\eta_i,\zeta_i)$
be chosen. Finally, let F(x,y,z) be evaluated at each of the points
$P_i$. Then the limit of the sum

$$\sum_{i=1}^{n} F(\xi_i,\eta_i,\zeta_i)\,\Delta S_i$$

as n becomes infinite in such a way that not only the area of each
$\Delta S_i$ but also its maximum chord approaches zero, is the surface
integral

$$\iint_S F(x,y,z)\,dS$$

Similarly, given a function F(x,y,z) and a region of space V,
we can subdivide V into arbitrary subregions $\Delta V_i$, then evaluate
F(x,y,z) at an arbitrary point $P_i:(\xi_i,\eta_i,\zeta_i)$ in each $\Delta V_i$, and form the
sum

$$\sum_{i=1}^{n} F(\xi_i,\eta_i,\zeta_i)\,\Delta V_i$$

The limit of this sum as n becomes infinite in such a way that not
only the volume of each $\Delta V_i$ but also its maximum chord approaches
zero, is the volume integral

$$\iiint_V F(x,y,z)\,dV$$

* A surface is said to be smooth if at each of its points there exists a tangent
plane which varies continuously as the point varies continuously on the
surface. A smooth surface is said to be orientable if it is two-sided; that is,
if it is possible at each point to identify consistently a unique direction normal
to the surface. A surface which can be subdivided by a finite number of
sectionally smooth curves into pieces each of which is orientable (and therefore
smooth) is said to be regular. For a discussion of smooth surfaces which
are not regular, that is, smooth one-sided surfaces, see, for instance, Richard
Courant and Herbert Robbins, “What Is Mathematics?”, pp. 259-264,
Oxford Book Company, Inc., New York, 1951.


EXAMPLE 4

What is the integral of the function $x^2z$ over the entire surface of the right circular cylinder of
height h which stands on the circle $x^2 + y^2 = a^2$? What is the integral of the given function
throughout the volume of the cylinder?

To answer the first question, we must perform three integrations; i.e., we must integrate
separately over the curved surface, the lower base, and the upper base of the cylinder. In each
case, of course, we must employ a subdivision of the appropriate portion of the surface which will
lead to integrals that can conveniently be evaluated. This is most easily done by using polar
coordinates, as shown in Fig. 12.21. Then, on the curved surface, say $S_1$, we have
$$dS_1 = a\,d\theta\,dz \qquad x = a\cos\theta \qquad z = z$$
and the integral
$$\iint_{S_1} x^2z\,dS_1 = \int_0^h\int_0^{2\pi} (a\cos\theta)^2 z\,(a\,d\theta\,dz) = \frac{\pi a^3h^2}{2}$$
On the lower base, say $S_2$, we have
$$dS_2 = r\,dr\,d\theta \qquad x = r\cos\theta \qquad z = 0$$
However, because of the factor z, the integrand vanishes identically on $S_2$, and without further
calculations we have
$$\iint_{S_2} x^2z\,dS_2 = 0$$
On the upper base, say $S_3$, we have $dS_3 = r\,dr\,d\theta$, $x = r\cos\theta$, and $z = h$. Hence,
$$\iint_{S_3} x^2z\,dS_3 = \int_0^{2\pi}\int_0^a (r\cos\theta)^2 h\,(r\,dr\,d\theta) = h\int_0^{2\pi}\cos^2\theta\left[\frac{r^4}{4}\right]_0^a d\theta = \frac{a^4h}{4}\left[\frac{\theta}{2} + \frac{\sin 2\theta}{4}\right]_0^{2\pi} = \frac{\pi a^4h}{4}$$
The integral over the entire surface S is, of course, the sum of the integrals over $S_1$, $S_2$, and
$S_3$:
$$\iint_S x^2z\,dS = \frac{\pi a^3h^2}{2} + 0 + \frac{\pi a^4h}{4} = \frac{\pi a^3h}{4}(2h + a)$$
In computing the required volume integral it is also convenient to use polar coordinates.
Doing this, we have $dV = r\,dr\,d\theta\,dz$, $x = r\cos\theta$, $z = z$, and the integral
$$\iiint_V x^2z\,dV = \int_0^h\int_0^{2\pi}\int_0^a (r\cos\theta)^2 z\,(r\,dr\,d\theta\,dz) = \frac{a^4}{4}\int_0^h\int_0^{2\pi} z\cos^2\theta\,d\theta\,dz = \frac{\pi a^4}{4}\int_0^h z\,dz = \frac{\pi a^4h^2}{8}$$

[Figure 12.21: cylindrical coordinates.]
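
The surface and volume integrals of Example 4 can be approximated directly from their definitions as limits of sums. The following sketch is illustrative only: the function names and the sample dimensions a = 2 and h = 3 are assumptions. It subdivides each portion of the cylinder in cylindrical coordinates and evaluates x²z at the midpoint of every element.

```python
import math

def midpoints(lo, hi, n):
    """Return the n midpoints of equal subintervals of [lo, hi] and the step."""
    step = (hi - lo) / n
    return [lo + (i + 0.5) * step for i in range(n)], step

def cylinder_integrals(a, h, n=60):
    """Approximate the integrals of x^2 z over the curved surface, lower base,
    upper base and volume of the cylinder x^2 + y^2 <= a^2, 0 <= z <= h."""
    thetas, dth = midpoints(0.0, 2.0 * math.pi, n)
    zs, dz = midpoints(0.0, h, n)
    rs, dr = midpoints(0.0, a, n)
    f = lambda x, z: x * x * z
    # Curved surface: dS = a dtheta dz, x = a cos(theta)
    curved = sum(f(a * math.cos(t), z) * a * dth * dz for t in thetas for z in zs)
    # Bases: dS = r dr dtheta, x = r cos(theta), with z = 0 or z = h
    lower = sum(f(r * math.cos(t), 0.0) * r * dr * dth for r in rs for t in thetas)
    upper = sum(f(r * math.cos(t), h) * r * dr * dth for r in rs for t in thetas)
    # Volume: dV = r dr dtheta dz
    volume = sum(f(r * math.cos(t), z) * r * dr * dth * dz
                 for r in rs for t in thetas for z in zs)
    return curved, lower, upper, volume

a, h = 2.0, 3.0   # sample dimensions, chosen only for illustration
curved, lower, upper, volume = cylinder_integrals(a, h)
print(curved, math.pi * a**3 * h**2 / 2)
print(lower, 0.0)
print(upper, math.pi * a**4 * h / 4)
print(volume, math.pi * a**4 * h**2 / 8)
```

Each line prints the numerical sum next to the corresponding closed-form value; the small discrepancy shrinks as n is increased.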

For the most part, our interest in line, surface, and volume
integrals will be theoretical rather than computational; that is, we
shall use them far more often in derivations than in numerical
calculation. Fundamental among the theorems we will need for
this purpose is Green’s lemma,* which relates the line integral of
a function taken around the boundary of a plane region to the
surface integral of an associated function taken over the region
itself:

* Named for the English mathematical physicist George Green (1793-1841).


THEOREM 1

If R is a plane region bounded by a finite number of simple closed curves* and
if U(x,y), V(x,y), $\partial U/\partial y$, and $\partial V/\partial x$ are continuous at all points of R and its boundary C,
then

$$\int_C U\,dx + V\,dy = \iint_R \left(\frac{\partial V}{\partial x} - \frac{\partial U}{\partial y}\right)dx\,dy$$

* For our purposes it is sufficient to define a simple closed curve as a closed,
sectionally smooth curve which does not cross itself. That this is not the
whole story, however, can be inferred from the article “What Is a Curve?”
by G. T. Whyburn in the Am. Math. Monthly, vol. 49, pp. 493-497, October,
1942.

PROOF Let us first suppose that the boundary of R is a single simple closed
curve C with the property that any line parallel to either of the coordinate axes
cuts it in at most two points, and let us draw the horizontal and vertical lines
which circumscribe C (Fig. 12.22). Then the arcs $P_4P_1P_2$ and $P_4P_3P_2$ define single-valued
functions of x, which we shall call $f_1(x)$ and $f_2(x)$, respectively. Similarly,
the arcs $P_1P_4P_3$ and $P_1P_2P_3$ define single-valued functions of y, which we shall
call $g_1(y)$ and $g_2(y)$, respectively. Now consider

$$I_1 = \iint_R \frac{\partial V}{\partial x}\,dx\,dy$$

To carry out this integration over R, it is sufficient to integrate with respect to x
from the arc $P_1P_4P_3$ to the arc $P_1P_2P_3$ and then to integrate with respect to y
from c to d. Hence,

$$I_1 = \int_c^d \int_{g_1(y)}^{g_2(y)} \frac{\partial V}{\partial x}\,dx\,dy$$

The inner integration can easily be performed, and we find

$$I_1 = \int_c^d V(x,y)\Big|_{g_1(y)}^{g_2(y)}\,dy = \int_c^d V[g_2(y),y]\,dy - \int_c^d V[g_1(y),y]\,dy$$
$$= \int_c^d V[g_2(y),y]\,dy + \int_d^c V[g_1(y),y]\,dy$$

Now, the first of these integrals is precisely the line integral

$$\int_C V(x,y)\,dy$$

taken along the path $x = g_2(y)$ from $P_1$ to $P_3$, and the second is just the same line
integral taken along the path $x = g_1(y)$ in the direction from $P_3$, through $P_4$, to $P_1$.
Together, then, they constitute the line integral of V(x,y) around the entire closed
curve C; hence,

(1) $$\iint_R \frac{\partial V}{\partial x}\,dx\,dy = \int_C V(x,y)\,dy$$

Similarly, if we consider

$$I_2 = \iint_R \frac{\partial U}{\partial y}\,dx\,dy = \iint_R \frac{\partial U}{\partial y}\,dy\,dx$$

we can write more specifically

$$I_2 = \int_a^b \int_{f_1(x)}^{f_2(x)} \frac{\partial U}{\partial y}\,dy\,dx$$

Performing the inner integration, we have

$$I_2 = \int_a^b U(x,y)\Big|_{f_1(x)}^{f_2(x)}\,dx = \int_a^b U[x,f_2(x)]\,dx - \int_a^b U[x,f_1(x)]\,dx$$
$$= -\int_b^a U[x,f_2(x)]\,dx - \int_a^b U[x,f_1(x)]\,dx$$

The first of these integrals is just the negative of the line integral of U(x,y) along
$y = f_2(x)$ in the direction from $P_2$ to $P_4$. The second is the negative of the integral
of U(x,y) along $y = f_1(x)$ from $P_4$ to $P_2$. Together they constitute the negative of
the line integral of U(x,y) entirely around C in the same direction in which we
integrated in (1):

(2) $$\iint_R \frac{\partial U}{\partial y}\,dx\,dy = -\int_C U(x,y)\,dx$$

If we subtract (2) from (1) and combine the integrals on each side, we obtain

(3) $$\int_C U\,dx + V\,dy = \iint_R \left(\frac{\partial V}{\partial x} - \frac{\partial U}{\partial y}\right)dx\,dy$$





which establishes Green’s lemma for the special regions we have thus far been
considering.

It is a simple matter, now, to extend Green’s lemma to regions whose boundaries
do not satisfy the condition that every line parallel to either of the coordinate
axes cuts them in at most two points. For, if this is not the case, the region R can
be divided into subregions $R_i$ whose boundaries $C_i$ do have this property. Then
Eq. (3) can be applied to each subregion, following which the addition of these
results yields Green’s lemma for the general region R itself. For instance, for the
region shown in Fig. 12.23 we can subdivide as indicated and then apply Eq. (3) to
each subregion, getting

$$\int_{C_i} U\,dx + V\,dy = \iint_{R_i} \left(\frac{\partial V}{\partial x} - \frac{\partial U}{\partial y}\right)dx\,dy \qquad i = 1, 2, 3, 4$$

When these results are added, the four integrals on the right combine to give
exactly

$$\iint_R \left(\frac{\partial V}{\partial x} - \frac{\partial U}{\partial y}\right)dx\,dy$$

since $R_1 + R_2 + R_3 + R_4 = R$. Moreover, the four line integrals on the left
combine to give the line integral around the two curves which form the boundary
of R plus a set of line integrals taken along the auxiliary boundary arcs $P_iQ_i$.
Since U and V are continuous throughout R, however, these auxiliary integrals cancel
in pairs, because each of the segments $P_iQ_i$ is traversed twice in opposite directions.
Hence we are left with

$$\int_C U\,dx + V\,dy = \iint_R \left(\frac{\partial V}{\partial x} - \frac{\partial U}{\partial y}\right)dx\,dy$$

which is the assertion of Green’s lemma.

The direction in which it is necessary to integrate around 
C, in order for Green’s lemma to be correct as we have stated it, 
is characterized by the fact that an observer moving along C in 
this direction always has the interior of the region R on his left. 
This direction is called the positive direction of traversing C. 
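
Green’s lemma, together with the rule for the positive direction, can be checked numerically on a simple region. The sketch below is an illustration under stated assumptions: the functions U = -y³ and V = x³, the unit disk as the region R, and the name `green_check` are all chosen for the illustration and are not taken from the text. For this choice $\partial V/\partial x - \partial U/\partial y = 3x^2 + 3y^2$, and both sides of the lemma equal 3π/2.

```python
import math

def green_check(n_boundary=2000, n_r=200, n_theta=200):
    """Compare both sides of Green's lemma for U = -y^3, V = x^3 on the unit disk."""
    # Left side: line integral of U dx + V dy around x^2 + y^2 = 1,
    # traversed counterclockwise so that the interior lies on the left.
    dt = 2.0 * math.pi / n_boundary
    line = 0.0
    for i in range(n_boundary):
        t = (i + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t) * dt, math.cos(t) * dt
        line += (-y ** 3) * dx + (x ** 3) * dy
    # Right side: double integral of dV/dx - dU/dy = 3x^2 + 3y^2 over the disk,
    # using polar coordinates with area element r dr dtheta.
    dr = 1.0 / n_r
    dth = 2.0 * math.pi / n_theta
    double = 0.0
    for j in range(n_r):
        r = (j + 0.5) * dr
        for m in range(n_theta):
            th = (m + 0.5) * dth
            x, y = r * math.cos(th), r * math.sin(th)
            double += (3.0 * x * x + 3.0 * y * y) * r * dr * dth
    return line, double

line, double = green_check()
print(line, double, 1.5 * math.pi)
```

Traversing the circle clockwise instead would reverse the sign of the line integral, so that the lemma as stated would fail; this is the role of the positive direction.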

EXERCISES 

1 Discuss the extension of Green’s lemma to regions whose boundaries contain segments 
which are parallel to one or the other of the coordinate axes. 

2 Evaluate $\int_{0,1}^{2,3} (2xy - 1)\,dx + (x^2 + 1)\,dy$ along the paths $y = x + 1$ and $y = (x^2/2) + 1$.

3 Evaluate $\int x^2y^2\,ds$ around the circle $x^2 + y^2 = 1$. (Hint: Use polar coordinates.)

4 Along what curve of the family $y = kx(1 - x)$ does the integral $\int_{0,0}^{1,0} y(x - y)\,dx$ attain its
largest value?





/ 1.0 

x 3/(1 + x) dy (a) along the x-axis and (b) along y = 1 — x 2 . 

6 Evaluate xds along the paths y — x, y = x?-, and y = x 2 . 

7 Evaluate $\iint_S (x + y)z\,dS$, where S is the surface of the cube whose vertices are (0,0,0),
(1,0,0), (1,1,0), (0,1,0), (0,0,1), (1,0,1), (1,1,1), and (0,1,1).

8 Evaluate $\iint_S (x + y + z)\,dS$, where S is the portion of the surface of the sphere $x^2 + y^2 + z^2 = a^2$
which lies in the first octant. (Hint: Use spherical coordinates.)

8 Evaluate jJJ v X “Z dV, where V is the volume under the surface x~ + y- z- = a 2 and 
above the xy-plane. 

10 Verify Green’s lemma for the integral $\int (x^2 + y)\,dx - xy^2\,dy$, taken around the boundary of
the square whose vertices are (0,0), (1,0), (1,1), and (0,1).

11 Verify Green’s lemma for the integral $\int (x - y)\,dx + (x + y)\,dy$ taken around the boundary
of the finite area in the first quadrant between the curves $y = x^2$ and $y^2 = x$.

12 Verify Green’s lemma for the integral $\int (x - 2y)\,dx + x\,dy$ taken around the circle
$x^2 + y^2 = a^2$.

13 If a particle is attracted toward the origin by a force proportional to the nth power of the
distance from the origin, show that the work done against this force in moving the particle
from the point $(x_0,y_0)$ to the point $(x_1,y_1)$ is independent of the path, and find its amount.

14 A particle is attracted toward the origin by a force proportional to the cube of the distance
from the origin. How much work is done in moving the particle from the origin to the point
(1,1) if motion takes place (a) along the path $y = x$, (b) along the path $y = x^2$, (c) along
the x-axis to (1,0) and then vertically to (1,1), and (d) along the y-axis to (0,1) and then
horizontally to (1,1), and if in each case the coefficient of friction between the particle and
the path is $\mu$?

15 If U, V, $\partial U/\partial y$, and $\partial V/\partial x$ are continuous and if $\partial U/\partial y = \partial V/\partial x$ at all points in the interior of a simple
closed curve C, show that $\int_\Gamma U\,dx + V\,dy = 0$ for any simple closed curve $\Gamma$ which lies
entirely within C.

16 Show that Green’s lemma fails to hold for the functions

$$U = -\frac{y}{x^2 + y^2} \qquad \text{and} \qquad V = \frac{x}{x^2 + y^2}$$

if R is the interior of the circle C: $x^2 + y^2 = 1$. Explain.

17 Using Green’s lemma, show that the area bounded by any simple closed curve C is given by
the formula $A = \frac{1}{2}\int_C x\,dy - y\,dx$. Is this formula correct for regions bounded by more
than one simple closed curve?

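18 Using Green’s lemma, establish the formula

$$\iint_R \left(\frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2}\right)dx\,dy = \int_C \frac{\partial F}{\partial n}\,ds$$

where R is the region bounded by the simple closed curve C, and $\partial F/\partial n$ is the directional
derivative of F in the direction of the outer normal to C.

19 By setting $U = f\,\dfrac{\partial g}{\partial x}$ and $V = f\,\dfrac{\partial g}{\partial y}$ in Green’s lemma, show that

$$\iint_R \left(\frac{\partial f}{\partial x}\frac{\partial g}{\partial y} - \frac{\partial f}{\partial y}\frac{\partial g}{\partial x}\right)dx\,dy = \int_C f\,dg$$

where R is the region bounded by the simple closed curve C. What is $\int_C g\,df$?

20 By setting $U = f\,\dfrac{\partial g}{\partial y}$ and $V = -f\,\dfrac{\partial g}{\partial x}$ in Green’s lemma, show that

$$\iint_R \left[f\left(\frac{\partial^2 g}{\partial x^2} + \frac{\partial^2 g}{\partial y^2}\right) + \frac{\partial f}{\partial x}\frac{\partial g}{\partial x} + \frac{\partial f}{\partial y}\frac{\partial g}{\partial y}\right]dx\,dy = \int_C f\,\frac{\partial g}{\partial n}\,ds$$

where R is the region bounded by the simple closed curve C, and $\partial g/\partial n$ is the directional derivative
of g in the direction of the outer normal to C.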


12.5 

Integral theorems 

The integrals we encounter in vector analysis are in most cases
scalar quantities. For instance, given a vector function $\mathbf{F}(x,y,z)$,
we are often interested in the integral of its tangential component
along a curve C or in the integral of its normal component over a
surface S. In the first case, if $\mathbf{R}$ is the vector from the origin to a
general point of C, so that $d\mathbf{R}/ds = \mathbf{T}$ is the unit vector tangent
to C at a general point, then $\mathbf{F} \cdot \mathbf{T}$ is the tangential component of
$\mathbf{F}$ and

(1) $$\int_C \mathbf{F} \cdot \mathbf{T}\,ds = \int_C \mathbf{F} \cdot \frac{d\mathbf{R}}{ds}\,ds = \int_C \mathbf{F} \cdot d\mathbf{R}$$

is the integral of this component along the curve C. In the second
case, if $\mathbf{N}$ is the unit vector normal to S at a general point, then
$\mathbf{F} \cdot \mathbf{N}$ is the normal component of $\mathbf{F}$ and

(2) $$\iint_S \mathbf{F} \cdot \mathbf{N}\,dS\ †$$

is the integral of this component over the surface S. Other scalar
integrals of frequent occurrence are the surface integral of the
normal component of the curl of $\mathbf{F}$:

(3) $$\iint_S (\nabla \times \mathbf{F}) \cdot \mathbf{N}\,dS$$

and the volume integral of the divergence of $\mathbf{F}$:

(4) $$\iiint_V \nabla \cdot \mathbf{F}\,dV$$

Fundamental in many of the applications of vector analysis
is the so-called divergence theorem, which asserts the equality of
the integrals (2) and (4) when V is the volume bounded by the
closed regular surface S:

† Some writers denote the differential vector $\mathbf{N}\,dS$ by the symbol $d\mathbf{S}$ or $d\mathbf{A}$.




THEOREM 1

If $\mathbf{F}(x,y,z)$ and $\nabla \cdot \mathbf{F}$ are continuous over the closed regular surface S and its
interior V and if $\mathbf{N}$ is the unit vector perpendicular to S at a general point and
extending outward from S, then

$$\iint_S \mathbf{F} \cdot \mathbf{N}\,dS = \iiint_V \nabla \cdot \mathbf{F}\,dV$$

PROOF To prove this theorem, we shall first suppose that S is a closed surface
such that no line parallel to one of the coordinate axes cuts it in more than two
points. Now, if $\mathbf{F} = u\mathbf{i} + v\mathbf{j} + w\mathbf{k}$, the assertion of the theorem can be written
at length in the form

$$\iint_S \mathbf{N} \cdot (u\mathbf{i} + v\mathbf{j} + w\mathbf{k})\,dS = \iiint_V \left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z}\right)dV$$

or

(5) $$\iint_S \mathbf{N} \cdot \mathbf{i}u\,dS + \iint_S \mathbf{N} \cdot \mathbf{j}v\,dS + \iint_S \mathbf{N} \cdot \mathbf{k}w\,dS = \iiint_V \frac{\partial u}{\partial x}\,dV + \iiint_V \frac{\partial v}{\partial y}\,dV + \iiint_V \frac{\partial w}{\partial z}\,dV$$

We shall establish (5) by proving that respective integrals on each side are equal.
To do this, let us consider first the integral

$$\iiint_V \frac{\partial w}{\partial z}\,dV$$

Under our assumption that no line parallel to one of the coordinate axes meets S
in more than two points, it follows, in particular, that S is a double-valued surface
over its projection on the xy-plane and, hence, can be thought of as consisting of a
lower half, say $S_1$, and an upper half, say $S_2$. Then, if we take $dV = dx\,dy\,dz$ and
perform the z-integration first, we have

(6) $$\iiint_V \frac{\partial w}{\partial z}\,dz\,dx\,dy = \iint \left(w\Big|_{\text{on } S_2} - w\Big|_{\text{on } S_1}\right)dx\,dy$$

where, of course, x and y range over the area in the xy-plane which is the projection
of S. Moreover, the elements $dS_1$ and $dS_2$ can be defined so that they have $dx\,dy$
as their common projection on the xy-plane (Fig. 12.24). Now $\mathbf{k} \cdot \mathbf{N}_1$ and $\mathbf{k} \cdot \mathbf{N}_2$
are, respectively, the cosines of the angles between the normal to the xy-plane $\mathbf{k}$
and the outer normals to $dS_1$ and $dS_2$; that is, they are numerically the cosines of
the angles through which $dS_1$ and $dS_2$ are projected onto the element $dx\,dy$. Hence,

$$dx\,dy = -\mathbf{k} \cdot \mathbf{N}_1\,dS_1 = \mathbf{k} \cdot \mathbf{N}_2\,dS_2$$

where the minus sign is necessary in the first equality because the outer normal
$\mathbf{N}_1$ to $dS_1$ makes an angle of more than 90° with the direction of $\mathbf{k}$ and thus $\mathbf{k} \cdot \mathbf{N}_1$
is negative, whereas both $dx\,dy$ and $dS_1$ are clearly positive. Therefore, substituting
for $dx\,dy$ in the right-hand side of (6), that is, transferring the integration from the
common projection of $S_1$ and $S_2$ back onto $S_1$ and $S_2$ themselves, we have

$$\iiint_V \frac{\partial w}{\partial z}\,dV = \iint w\Big|_{\text{on } S_2}\,dx\,dy - \iint w\Big|_{\text{on } S_1}\,dx\,dy$$
$$= \iint_{S_2} w\Big|_{\text{on } S_2}\,\mathbf{k} \cdot \mathbf{N}_2\,dS_2 + \iint_{S_1} w\Big|_{\text{on } S_1}\,\mathbf{k} \cdot \mathbf{N}_1\,dS_1$$
$$= \iint_{S_2} w\mathbf{k} \cdot \mathbf{N}\,dS + \iint_{S_1} w\mathbf{k} \cdot \mathbf{N}\,dS$$

where the subscripts have been dropped from the integrands as superfluous, since
the ranges of integration are now explicitly indicated. Finally, since $S_1$ and $S_2$
together make up the entire closed surface S, we can combine the last two integrals,
getting

$$\iiint_V \frac{\partial w}{\partial z}\,dV = \iint_S \mathbf{N} \cdot \mathbf{k}w\,dS$$

Similarly we can show that

$$\iiint_V \frac{\partial u}{\partial x}\,dV = \iint_S \mathbf{N} \cdot \mathbf{i}u\,dS$$

$$\iiint_V \frac{\partial v}{\partial y}\,dV = \iint_S \mathbf{N} \cdot \mathbf{j}v\,dS$$

Adding the last three equations, we obtain the expanded form (5) of the divergence
theorem, under the assumption that S is exactly two-valued over its projections
on each of the coordinate planes.

FIGURE 12.24 Integration in the z-direction from $S_1$ to $S_2$ in the proof of the divergence theorem.

On the other hand, if S does not have this property, we can always subdivide
its interior V into regions $V_i$ whose boundaries $S_i$ do have this property. Then,
applying our limited result to each of these regions, we obtain a set of equations of
the form

$$\iint_{S_i} \mathbf{N} \cdot \mathbf{F}\,dS = \iiint_{V_i} \nabla \cdot \mathbf{F}\,dV$$

If these are added, the sum of the volume integrals is, of course, just the integral
of $\nabla \cdot \mathbf{F}$ throughout the entire volume V. The sum of the surface integrals is equal
to the integral of $\mathbf{N} \cdot \mathbf{F}$ over the original surface S plus a set of integrals over the
auxiliary boundary surfaces which were introduced when V was subdivided. These
cancel in pairs, however, since the integration extends twice over each interface,
with integrands which are identical except for the oppositely directed unit normals
they contain as factors. Thus, our proof can be extended to volumes bounded by
general closed regular surfaces, and Theorem 1 is established.
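
The divergence theorem can be checked numerically on a region for which the outer normals are easy to write down. The sketch below is only an illustration: the field $\mathbf{F} = x^2y\,\mathbf{i} + y^2z\,\mathbf{j} + z^2x\,\mathbf{k}$, the unit cube as the region V, and the function names are all assumptions chosen for the example. On each face of the cube the outer normal is plus or minus a coordinate unit vector, so the flux reduces to differences of single components of F.

```python
def F(x, y, z):
    """Hypothetical sample field F = (x^2 y, y^2 z, z^2 x)."""
    return (x * x * y, y * y * z, z * z * x)

def div_F(x, y, z):
    """Divergence of the sample field: 2xy + 2yz + 2zx."""
    return 2.0 * x * y + 2.0 * y * z + 2.0 * z * x

def cube_check(n=20):
    """Return (outward flux, volume integral of div F) for the unit cube."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    volume = sum(div_F(x, y, z) for x in mids for y in mids for z in mids) * h ** 3
    flux = 0.0
    dA = h * h
    for s in mids:
        for t in mids:
            # Faces x = 1 (N = +i) and x = 0 (N = -i)
            flux += (F(1.0, s, t)[0] - F(0.0, s, t)[0]) * dA
            # Faces y = 1 (N = +j) and y = 0 (N = -j)
            flux += (F(s, 1.0, t)[1] - F(s, 0.0, t)[1]) * dA
            # Faces z = 1 (N = +k) and z = 0 (N = -k)
            flux += (F(s, t, 1.0)[2] - F(s, t, 0.0)[2]) * dA
    return flux, volume

flux, volume = cube_check()
print(flux, volume)
```

For this sample field both integrals equal 3/2, and the two printed numbers should agree to within rounding error.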






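EXAMPLE 1

Prove that $\iint_S \mathbf{N} \times \mathbf{F}\,dS = \iiint_V \nabla \times \mathbf{F}\,dV$.

To show this, let us apply the divergence theorem to the vector $\mathbf{F} \times \mathbf{C}$, where $\mathbf{C}$ is an
arbitrary constant vector. Then

$$\iint_S \mathbf{N} \cdot (\mathbf{F} \times \mathbf{C})\,dS = \iiint_V \nabla \cdot (\mathbf{F} \times \mathbf{C})\,dV$$

Now taking advantage of the fact that $\mathbf{C}$ is a constant vector and that a cyclic permutation of
the elements of a scalar triple product leaves the product unchanged, we can write

$$\iint_S \mathbf{C} \cdot \mathbf{N} \times \mathbf{F}\,dS = \iiint_V \mathbf{C} \cdot \nabla \times \mathbf{F}\,dV$$

or, removing the constant vector $\mathbf{C}$ from each integral,

$$\mathbf{C} \cdot \iint_S \mathbf{N} \times \mathbf{F}\,dS = \mathbf{C} \cdot \iiint_V \nabla \times \mathbf{F}\,dV$$

Since $\mathbf{C}$ is an arbitrary vector, this equation asserts that the vectors

$$\iint_S \mathbf{N} \times \mathbf{F}\,dS \qquad \text{and} \qquad \iiint_V \nabla \times \mathbf{F}\,dV$$

have equal projections in all directions and, hence, must be equal to each other, as asserted.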

Various important theorems stem from the divergence theorem.
For instance, if u and v are two sufficiently differentiable
scalar point functions and if we set
$$\mathbf{F} = u\,\nabla v$$
then, by Eq. (13), Sec. 12.3,
$$\nabla \cdot \mathbf{F} = \nabla \cdot (u\,\nabla v) = u\,\nabla \cdot \nabla v + \nabla u \cdot \nabla v = \nabla u \cdot \nabla v + u\,\nabla^2 v$$
Hence, applying the divergence theorem to the vector $\mathbf{F} = u\,\nabla v$,
we have

(7) $$\iiint_V (\nabla u \cdot \nabla v + u\,\nabla^2 v)\,dV = \iint_S \mathbf{N} \cdot u\,\nabla v\,dS$$

Similarly, if we interchange the roles of u and v in (7), we obtain

(8) $$\iiint_V (\nabla v \cdot \nabla u + v\,\nabla^2 u)\,dV = \iint_S \mathbf{N} \cdot v\,\nabla u\,dS$$

Finally, if we subtract (8) from (7), we obtain what is known as
Green’s theorem:*

THEOREM 2

If V is the volume bounded by a closed regular surface S and if u(x,y,z) and v(x,y,z)
are scalar functions possessing continuous second partial derivatives, then

$$\iiint_V (u\,\nabla^2 v - v\,\nabla^2 u)\,dV = \iint_S \mathbf{N} \cdot (u\,\nabla v - v\,\nabla u)\,dS$$

* This should not be confused with Green’s lemma, Theorem 1, Sec. 12.4.

Another result of some importance can be obtained by applying
the divergence theorem to the function $\mathbf{F} = \mathbf{R}/r^3$, where, as
usual,

$$\mathbf{R} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k} \qquad \text{and} \qquad r = |\mathbf{R}| = \sqrt{x^2 + y^2 + z^2}$$


Thus, substituting into the divergence theorem, we have

(9) $$\iint_S \mathbf{N} \cdot \frac{\mathbf{R}}{r^3}\,dS = \iiint_V \nabla \cdot \frac{\mathbf{R}}{r^3}\,dV$$

Now, by Eq. (13), Sec. 12.3, and Exercise 13, Sec. 12.3,

$$\nabla \cdot \frac{\mathbf{R}}{r^3} = \frac{1}{r^3}\,\nabla \cdot \mathbf{R} + \mathbf{R} \cdot \nabla\frac{1}{r^3} = \frac{3}{r^3} + \mathbf{R} \cdot \left(-\frac{3}{r^5}\,\mathbf{R}\right) = \frac{3}{r^3} - \frac{3r^2}{r^5} = 0$$

Hence, we conclude from (9) that

(10) $$\iint_S \mathbf{N} \cdot \frac{\mathbf{R}}{r^3}\,dS = 0$$

provided, of course, that r is different from zero at all points on
and within S; that is, provided the origin from which $\mathbf{R}$ is drawn
does not lie on S or within the volume enclosed by the surface S.

FIGURE 12.25 A singular point excluded from a three-dimensional region by an auxiliary spherical boundary.

Sinoe the divergence theorem requires that the function to 
which it is applied have continuous first partial derivatives 
throughout the volume of integration, it cannot be applied to 
R/r 3 if the origin of R is within S. In this case we, therefore, 
modify the region of integration by constructing a sphere S' of 
radius e having the origin 0 as center (Fig. 12.25). In the region 






V* * ■ s 


x 


(ii) 


V between S and S' the function R/r 3 satisfies the conditions of 
the divergence theorem, and thus Eq. (10) can properly be 
applied, giving 


SL 


N -~dS = 0 




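Now, at any point of S', the direction of the normal which extends
outward from the volume V' is opposite to R. Hence, the unit
outer normal to S' is $\mathbf{N} = -\mathbf{R}/\epsilon$, since on S' the length of the
radius vector R is $r = \epsilon$. Therefore, in the last integral, on S' we have

$$\mathbf{N}\cdot\frac{\mathbf{R}}{r^3} = -\frac{\mathbf{R}}{\epsilon}\cdot\frac{\mathbf{R}}{\epsilon^3} = -\frac{\epsilon^2}{\epsilon^4} = -\frac{1}{\epsilon^2}$$

and Eq. (11) becomes

$$\iint_S \mathbf{N}\cdot\frac{\mathbf{R}}{r^3}\,dS + \iint_{S'} \left(-\frac{1}{\epsilon^2}\right) dS = 0$$

or

$$\iint_S \mathbf{N}\cdot\frac{\mathbf{R}}{r^3}\,dS = \frac{1}{\epsilon^2}\iint_{S'} dS = \frac{1}{\epsilon^2}\,(4\pi\epsilon^2) = 4\pi$$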

This result, coupled with Eq. (10), gives us Gauss’ theorem:


THEOREM 3

If S is a closed regular surface, then

$$\iint_S \mathbf{N}\cdot\frac{\mathbf{R}}{r^3}\,dS = \begin{cases} 0 & O \text{ outside } S \\ 4\pi & O \text{ inside } S \end{cases}$$
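Theorem 3 can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the surface S is taken to be an axis-aligned cube, and the function name `flux_through_cube`, the midpoint rule, and the grid size `n` are choices made here, not anything prescribed by the text. It estimates the integral of N · R/r³ over the six faces, with R drawn from the origin.

```python
import math

def flux_through_cube(center, half=1.0, n=100):
    """Midpoint-rule estimate of the flux of R/r^3 out of an axis-aligned cube.

    `center` is the center of the cube and `half` its half-edge length.
    R is drawn from the origin (0, 0, 0). The cube, the rule, and `n`
    are illustrative assumptions.
    """
    cx, cy, cz = center
    h = 2.0 * half / n                      # edge of one square surface cell
    total = 0.0
    for axis in range(3):                   # faces x = const, y = const, z = const
        for sign in (-1.0, 1.0):            # outer normal is sign * e_axis
            for i in range(n):
                for j in range(n):
                    p = [0.0, 0.0, 0.0]
                    p[axis] = sign * half
                    p[(axis + 1) % 3] = -half + (i + 0.5) * h
                    p[(axis + 2) % 3] = -half + (j + 0.5) * h
                    x, y, z = p[0] + cx, p[1] + cy, p[2] + cz
                    r = math.sqrt(x * x + y * y + z * z)
                    # N . R / r^3 on this face is sign * (component of R along axis) / r^3
                    total += sign * (x, y, z)[axis] / r**3 * h * h
    return total

if __name__ == "__main__":
    print("origin inside :", flux_through_cube((0.0, 0.0, 0.0)))
    print("origin outside:", flux_through_cube((3.0, 0.0, 0.0)))
    print("4*pi          :", 4 * math.pi)
```

With the cube centered at the origin the estimate should lie close to 4π, and with the cube displaced so that the origin is outside it the estimate should lie close to zero, in agreement with the two cases of the theorem.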


Another integral formula of great importance in vector
analysis is Stokes’ theorem:*


THEOREM 4 

If S is the portion of a regular surface bounded by the closed curve C and if
F(x,y,z) is a vector function possessing continuous first partial derivatives, then

$$\int_C \mathbf{F}\cdot d\mathbf{R} = \iint_S \mathbf{N}\cdot\nabla\times\mathbf{F}\,dS$$

provided the direction of integration around C is positive with respect to the side 
of S on which the unit normals are drawn. 

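PROOF To prove this, we suppose first that S has the property that it is
single-valued above its projections on each of the coordinate planes. Now, if we
write $\mathbf{F} = u\mathbf{i} + v\mathbf{j} + w\mathbf{k}$, Stokes’ theorem becomes

$$\int_C u\,dx + v\,dy + w\,dz = \iint_S \mathbf{N}\cdot\nabla\times(u\mathbf{i} + v\mathbf{j} + w\mathbf{k})\,dS$$

$$= \iint_S \mathbf{N}\cdot\nabla\times u\mathbf{i}\,dS + \iint_S \mathbf{N}\cdot\nabla\times v\mathbf{j}\,dS + \iint_S \mathbf{N}\cdot\nabla\times w\mathbf{k}\,dS \tag{12}$$

and, to establish it, it is sufficient to show that respective integrals on the two
sides of the last equation are equal. We consider first the integral

$$\iint_S \mathbf{N}\cdot\nabla\times u\mathbf{i}\,dS$$

taken over the closed surface consisting of S, its projection on the xy-plane, say
S', and the cylindrical surface, say S'', which projects S into S' (Fig. 12.26a). If
we apply the divergence theorem to the vector $\nabla\times u\mathbf{i}$ over this surface and the
volume it encloses, we obtain

$$\iint_S \mathbf{N}\cdot\nabla\times u\mathbf{i}\,dS + \iint_{S'} \mathbf{N}\cdot\nabla\times u\mathbf{i}\,dS + \iint_{S''} \mathbf{N}\cdot\nabla\times u\mathbf{i}\,dS = \iiint_V \nabla\cdot(\nabla\times u\mathbf{i})\,dV = 0$$

or

$$\iint_S \mathbf{N}\cdot\nabla\times u\mathbf{i}\,dS = -\iint_{S'} \mathbf{N}\cdot\nabla\times u\mathbf{i}\,dS - \iint_{S''} \mathbf{N}\cdot\nabla\times u\mathbf{i}\,dS \tag{13}$$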


* Named for the English mathematical physicist G. G. Stokes (1819-1903).


FIGURE 12.26 The closed surface S + S' + S'' employed in the proof of Stokes’ theorem.


since, as we showed in Sec. 12.3, the divergence of the curl of any vector is iden- 
tically zero. Now, 




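$$\nabla\times u\mathbf{i} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ u & 0 & 0 \end{vmatrix} = \mathbf{j}\,\frac{\partial u}{\partial z} - \mathbf{k}\,\frac{\partial u}{\partial y}$$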


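Moreover, on S' the outer normal N is clearly equal to $-\mathbf{k}$. Hence, on S' we have

$$\mathbf{N}\cdot\nabla\times u\mathbf{i} = -\mathbf{k}\cdot\left(\mathbf{j}\,\frac{\partial u}{\partial z} - \mathbf{k}\,\frac{\partial u}{\partial y}\right) = \frac{\partial u}{\partial y}$$

and

$$\iint_{S'} \mathbf{N}\cdot\nabla\times u\mathbf{i}\,dS = \iint_{S'} \frac{\partial u}{\partial y}\,dS$$

If we now apply Green’s lemma (Theorem 1, Sec. 12.4) to the last integral, we find

$$\iint_{S'} \mathbf{N}\cdot\nabla\times u\mathbf{i}\,dS = -\int_{C'} u\,dx \tag{14}$$

Furthermore, since S'' is a cylindrical surface whose generators are parallel to the
z-axis, the normals to S'' are all perpendicular to the vector k. Therefore, on S''
we have

$$\mathbf{N}\cdot\nabla\times u\mathbf{i} = \mathbf{N}\cdot\left(\mathbf{j}\,\frac{\partial u}{\partial z} - \mathbf{k}\,\frac{\partial u}{\partial y}\right) = \mathbf{N}\cdot\mathbf{j}\,\frac{\partial u}{\partial z}$$

where, clearly, $\mathbf{N}\cdot\mathbf{j}$ is independent of z. Then, taking $dS = dz\,ds$ (Fig. 12.26a),
we have

$$\iint_{S''} \mathbf{N}\cdot\nabla\times u\mathbf{i}\,dS = \int_{C'} \left(u\big|_S - u\big|_{S'}\right)\mathbf{N}\cdot\mathbf{j}\,ds \tag{15}$$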

Now $\mathbf{N}\cdot\mathbf{j}$ is equal to the cosine of the angle between the normal N and the
positive y-axis, and this is numerically equal but opposite in sign to the cosine of the
angle between the directed tangent to C' and the positive x-axis (Fig. 12.26b).
Hence, $\mathbf{N}\cdot\mathbf{j}\,ds = -dx$, and Eq. (15) becomes

$$\iint_{S''} \mathbf{N}\cdot\nabla\times u\mathbf{i}\,dS = -\int_{C'} u\big|_S\,dx + \int_{C'} u\big|_{S'}\,dx \tag{16}$$

Now, in the first integral on the right in (16), the integrand, being evaluated at
those points of S which are directly above the curve C', is actually evaluated
along the curve C. Moreover, because C' is the projection of C in the z-direction,
the variation of x around C' is exactly the same as the variation of x around C.
Hence, in this integral we can properly replace the indicated path of integration
C' by the curve C, getting

$$\iint_{S''} \mathbf{N}\cdot\nabla\times u\mathbf{i}\,dS = -\int_C u\,dx + \int_{C'} u\,dx \tag{17}$$

Therefore, substituting from (14) and (17) into (13), we have

$$\iint_S \mathbf{N}\cdot\nabla\times u\mathbf{i}\,dS = -\left(-\int_{C'} u\,dx\right) - \left(-\int_C u\,dx + \int_{C'} u\,dx\right) = \int_C u\,dx \tag{18}$$

In precisely the same way we can show that

$$\iint_S \mathbf{N}\cdot\nabla\times v\mathbf{j}\,dS = \int_C v\,dy \tag{19}$$

$$\iint_S \mathbf{N}\cdot\nabla\times w\mathbf{k}\,dS = \int_C w\,dz \tag{20}$$

Finally, by adding (18), (19), and (20) we obtain Eq. (12).

It is now a simple matter to extend Eq. (12) to surfaces S which are not
single-valued above their projections on the coordinate planes. For, if this is not
the case, we can always subdivide S into regions $S_i$ which do have this property
and then apply Eq. (12) to each $S_i$ and its boundary $C_i$, getting the set of equations

$$\int_{C_1} \mathbf{F}\cdot d\mathbf{R} = \iint_{S_1} \mathbf{N}\cdot\nabla\times\mathbf{F}\,dS$$

$$\cdots$$

$$\int_{C_n} \mathbf{F}\cdot d\mathbf{R} = \iint_{S_n} \mathbf{N}\cdot\nabla\times\mathbf{F}\,dS$$

When these are added, the surface integrals combine to give precisely the surface
integral over S itself, since $S_1 + \cdots + S_n = S$. At the same time the line
integrals combine to give the line integral around the actual boundary of S plus
the line integral along all the auxiliary boundary arcs taken twice in opposite
directions (Fig. 12.27). Since the latter cancel identically, the line integral around
C itself is all that remains, and the theorem follows in the general case.
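A numerical check of Stokes’ theorem on one concrete case may be helpful. The sketch below is an illustration only: the sample field F = (z − y)i + xj, the choice of the upper half of the unit sphere as S, and all function names are assumptions made here, not taken from the text. The curl is approximated by central differences and both integrals by the midpoint rule.

```python
import math

def field(x, y, z):
    # Sample field chosen for illustration (an assumption): F = (z - y) i + x j
    return (z - y, x, 0.0)

def curl(f, x, y, z, h=1e-5):
    """Central-difference approximation to the curl of f at (x, y, z)."""
    def d(i, axis):
        # partial derivative of component i of f with respect to coordinate `axis`
        p = [x, y, z]
        q = [x, y, z]
        p[axis] += h
        q[axis] -= h
        return (f(*p)[i] - f(*q)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def line_integral(f, n=2000):
    """Integral of F . dR counterclockwise around the unit circle z = 0."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        fx, fy, _ = f(math.cos(t), math.sin(t), 0.0)
        total += (-fx * math.sin(t) + fy * math.cos(t)) * dt
    return total

def surface_integral(f, n=200):
    """Integral of N . curl F over the upper unit hemisphere, N pointing outward."""
    dth = (math.pi / 2) / n
    dph = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * dth
        for j in range(n):
            ph = (j + 0.5) * dph
            # on the unit sphere the outer normal equals the position vector
            nx = math.sin(th) * math.cos(ph)
            ny = math.sin(th) * math.sin(ph)
            nz = math.cos(th)
            cx, cy, cz = curl(f, nx, ny, nz)
            total += (nx * cx + ny * cy + nz * cz) * math.sin(th) * dth * dph
    return total

if __name__ == "__main__":
    print("line integral   :", line_integral(field))
    print("surface integral:", surface_integral(field))
```

For this field the curl is j + 2k, so both integrals can also be evaluated by hand and equal 2π; the counterclockwise direction around the circle is the positive direction relative to the outward (upward) normals of the hemisphere.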

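FIGURE 12.27 A portion of a surface S subdivided into simpler regions $S_1, S_2, \ldots$

If A and B are two arbitrary points in space, it is often
important to know whether the line integral

$$\int_A^B \mathbf{F}\cdot d\mathbf{R} \tag{21}$$

is independent of the path which joins A and B. As a first step in
establishing criteria for this, we observe that, if the integral (21)
is independent of the path, then

$$\oint \mathbf{F}\cdot d\mathbf{R}$$

taken around any closed path is zero. For let C be any simple closed
curve, and let A and B be any two points on C (Fig. 12.28).
Then, since the integral is independent of the path, by hypothesis,
we have, for the two arcs APB and AQB into which A and B divide C,

$$\int_{APB} \mathbf{F}\cdot d\mathbf{R} = \int_{AQB} \mathbf{F}\cdot d\mathbf{R}$$

Now, if we reverse the direction of integration in the integral on
the right, we have

$$\int_{APB} \mathbf{F}\cdot d\mathbf{R} = -\int_{BQA} \mathbf{F}\cdot d\mathbf{R}$$

or, transposing,

$$\int_{APB} \mathbf{F}\cdot d\mathbf{R} + \int_{BQA} \mathbf{F}\cdot d\mathbf{R} = \oint_C \mathbf{F}\cdot d\mathbf{R} = 0$$

as asserted.

Conversely, if $\oint \mathbf{F}\cdot d\mathbf{R}$ is zero around every closed curve in a
region, then the integral (21) is independent of the path. For if
APB and AQB are any two paths joining A and B (Fig. 12.28),
we have, by hypothesis,

$$\int_{APB} \mathbf{F}\cdot d\mathbf{R} + \int_{BQA} \mathbf{F}\cdot d\mathbf{R} = 0$$

whence, by reversing the direction of integration along BQA and
transposing,

$$\int_{APB} \mathbf{F}\cdot d\mathbf{R} = \int_{AQB} \mathbf{F}\cdot d\mathbf{R}$$

as asserted.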


FIGURE 12.28 Two paths from A to B forming a simple closed curve.

Now if the integral (21) is independent of the path, then
when we integrate from a fixed point $P_0\!:(x_0,y_0,z_0)$ to a variable
point $P\!:(x,y,z)$, the result is a function only of the coordinates
x, y, z of the variable end point. That is, if $\mathbf{F} = u\mathbf{i} + v\mathbf{j} + w\mathbf{k}$,
we can appropriately write

$$\int_{P_0}^{P} \mathbf{F}\cdot d\mathbf{R} = \int_{P_0}^{P} u\,dx + v\,dy + w\,dz = \phi(x,y,z)$$

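In what follows it will be necessary to know the partial derivatives
of the function $\phi$ defined by the last equation. To obtain these, it
is convenient to go back to the fundamental definition of a
derivative and write, for the x-partial derivative, for instance,

$$\frac{\partial\phi}{\partial x} = \lim_{\Delta x\to 0} \frac{\phi(x+\Delta x, y, z) - \phi(x,y,z)}{\Delta x}$$

$$= \lim_{\Delta x\to 0} \frac{1}{\Delta x}\left(\int_{x_0,y_0,z_0}^{x+\Delta x,\,y,\,z} u\,dx + v\,dy + w\,dz - \int_{x_0,y_0,z_0}^{x,\,y,\,z} u\,dx + v\,dy + w\,dz\right)$$

Since by hypothesis these integrals are independent of the path,
we can use any paths we find convenient. In particular, in the
integral from $(x_0,y_0,z_0)$ to $(x + \Delta x, y, z)$, we shall let the path of
integration consist of any smooth curve joining $(x_0,y_0,z_0)$ to $(x,y,z)$
plus the segment of the straight line joining $(x,y,z)$ to $(x + \Delta x, y, z)$
(Fig. 12.29). Then,

$$\frac{\partial\phi}{\partial x} = \lim_{\Delta x\to 0} \frac{1}{\Delta x}\left(\int_{x_0,y_0,z_0}^{x,\,y,\,z} u\,dx + v\,dy + w\,dz + \int_{x,\,y,\,z}^{x+\Delta x,\,y,\,z} u\,dx + v\,dy + w\,dz - \int_{x_0,y_0,z_0}^{x,\,y,\,z} u\,dx + v\,dy + w\,dz\right)$$

$$= \lim_{\Delta x\to 0} \frac{1}{\Delta x}\int_{x,\,y,\,z}^{x+\Delta x,\,y,\,z} u\,dx + v\,dy + w\,dz$$

Now, along the path of integration in the last integral, we have

$$dy = 0 \quad\text{and}\quad dz = 0$$

Hence,

$$\frac{\partial\phi}{\partial x} = \lim_{\Delta x\to 0} \frac{1}{\Delta x}\int_x^{x+\Delta x} u\,dx$$

Since u is assumed to be continuous, the law of the mean for
integrals can be applied to the last expression, and we have

$$\frac{\partial\phi}{\partial x} = \lim_{\Delta x\to 0} \frac{1}{\Delta x}\,[\,u(x + \theta\,\Delta x, y, z)\,\Delta x\,] \qquad 0 < \theta < 1$$

$$= u(x,y,z)$$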




In the same way the partial derivatives with respect to y and z 
can be determined, and we have the following theorem: 


THEOREM 5 

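If $\mathbf{F} = u\mathbf{i} + v\mathbf{j} + w\mathbf{k}$ is a continuous function of x, y, and z with the property that

$$\int \mathbf{F}\cdot d\mathbf{R} = \int u\,dx + v\,dy + w\,dz$$

is independent of the path, then the partial derivatives of the function

$$\phi(x,y,z) = \int_{P_0}^{P} \mathbf{F}\cdot d\mathbf{R} = \int_{P_0}^{P} u\,dx + v\,dy + w\,dz$$

are

$$\frac{\partial\phi}{\partial x} = u \qquad \frac{\partial\phi}{\partial y} = v \qquad \frac{\partial\phi}{\partial z} = w$$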


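We are now in a position to show that, if $\mathbf{F} = u\mathbf{i} + v\mathbf{j} + w\mathbf{k}$
is a continuous vector function and if $\int \mathbf{F}\cdot d\mathbf{R}$ is independent
of the path, then F is the gradient of some scalar function $\phi$. In
fact, if we define

$$\phi(x,y,z) = \int_{P_0}^{P} \mathbf{F}\cdot d\mathbf{R} = \int_{P_0}^{P} u\,dx + v\,dy + w\,dz$$

we have, by Theorem 5,

$$\nabla\phi = \frac{\partial\phi}{\partial x}\,\mathbf{i} + \frac{\partial\phi}{\partial y}\,\mathbf{j} + \frac{\partial\phi}{\partial z}\,\mathbf{k} = u\mathbf{i} + v\mathbf{j} + w\mathbf{k} = \mathbf{F}$$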


Before we can state a correct converse of the last result, we
must distinguish between two types of regions in space. On the
one hand, a region V may have the property that every simple
closed curve within it can be continuously contracted into a point
without at any stage having to leave the region. Regions of this
type are called simply connected; as examples we have the
interior of a sphere, the exterior of a sphere, and the space between
two concentric spheres. On the other hand, a region V may con-
tain simple closed curves which cannot be continuously con-
tracted into a point without at some stage having to leave the
region. Such regions are called multiply connected; as an example
we have the space between two infinitely long, coaxial cylinders,
within which it is clearly impossible to shrink into a single point
any closed curve encircling the inner cylindrical boundary. Both
the interior and the exterior of a torus are also examples of
multiply connected regions.*

Now suppose that, throughout a simply connected region V,
the vector function F is the gradient of a scalar function $\phi$. Then

$$\int_A^B \mathbf{F}\cdot d\mathbf{R} = \int_A^B \nabla\phi\cdot d\mathbf{R} = \int_A^B d\phi = \phi\,\Big|_A^B$$

and thus the integral of F depends only on the coordinates of the
end points A and B and not on the path which joins them. It is
easy to show by an example (Exercise 30) that this is not neces-
sarily true for multiply connected regions, since in such cases $\phi$
need not be continuous and single-valued throughout the region.

* The distinction between simply connected and multiply connected regions
applies equally well in the plane, of course, and in our study of functions of
a complex variable it will often be an important consideration.

Finally, we observe that, if the curl of F is identically zero
throughout a simply connected region V, then $\int \mathbf{F}\cdot d\mathbf{R}$ is inde-
pendent of the path, and conversely. For, if C is an arbitrary
closed curve in a simply connected region V, it can be spanned
by a surface S also lying entirely in V. Then, by Stokes’ theorem,
we have

$$\int_C \mathbf{F}\cdot d\mathbf{R} = \iint_S \mathbf{N}\cdot\nabla\times\mathbf{F}\,dS$$

and, if $\nabla\times\mathbf{F} = 0$, it follows that

$$\int_C \mathbf{F}\cdot d\mathbf{R} = 0$$

But, by one of our earlier observations, if $\int \mathbf{F}\cdot d\mathbf{R}$ is zero around
every closed curve, then it is independent of the path, as asserted.
On the other hand, if $\int \mathbf{F}\cdot d\mathbf{R}$ is independent of the path, then, as
we showed above, F is the gradient of a certain scalar function $\phi$.
But then $\nabla\times\mathbf{F} = \nabla\times\nabla\phi$, and this is identically zero, by Eq.
(18), Sec. 12.3.

The results of the preceding discussion can now be sum- 
marized in the following theorem: 

THEOREM 6 

If $\mathbf{F} = u\mathbf{i} + v\mathbf{j} + w\mathbf{k}$ is a function of x, y, and z possessing continuous first partial
derivatives at all points of a simply connected region V, then the following state-
ments are all equivalent; that is, any one of them implies each of the others:

a $\int \mathbf{F}\cdot d\mathbf{R} \equiv \int u\,dx + v\,dy + w\,dz$ is independent of the path.

b $\int \mathbf{F}\cdot d\mathbf{R} \equiv \int u\,dx + v\,dy + w\,dz$ is zero around every closed curve.

c $\mathbf{F}\cdot d\mathbf{R} \equiv u\,dx + v\,dy + w\,dz$ is an exact differential.

d F is the gradient of the scalar point function

$$\phi(x,y,z) = \int_{P_0}^{P} \mathbf{F}\cdot d\mathbf{R} \equiv \int_{P_0}^{P} u\,dx + v\,dy + w\,dz$$

e The curl of F vanishes identically.
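The role of simple connectivity in Theorem 6 can be seen numerically. The Python sketch below is an illustration under assumptions made here: the gradient field of φ = x²y + x and the field (−yi + xj)/(x² + y²) are standard textbook examples chosen for this purpose (the second is not necessarily the function intended in Exercise 30), and the function names are hypothetical. Each line integral is approximated by summing F · ΔR over short chords, with F evaluated at the chord midpoints.

```python
import math

def line_integral(f, path, n=20000):
    """Estimate the integral of F . dR along path(t), 0 <= t <= 1, in the xy-plane."""
    total = 0.0
    for k in range(n):
        x0, y0 = path(k / n)
        x1, y1 = path((k + 1) / n)
        fx, fy = f((x0 + x1) / 2, (y0 + y1) / 2)   # field at the chord midpoint
        total += fx * (x1 - x0) + fy * (y1 - y0)
    return total

def grad_field(x, y):
    # Gradient of phi = x^2 y + x (an illustrative choice), defined in the whole plane
    return (2 * x * y + 1, x * x)

def vortex(x, y):
    # Curl-free away from the z-axis, but defined only where x^2 + y^2 > 0
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def upper(t):
    # Upper arc of the unit circle from A = (-1, 0) to B = (1, 0)
    a = math.pi * (1 - t)
    return (math.cos(a), math.sin(a))

def lower(t):
    # Lower arc of the unit circle from A = (-1, 0) to B = (1, 0)
    a = math.pi * (1 + t)
    return (math.cos(a), math.sin(a))

if __name__ == "__main__":
    for name, f in (("gradient field", grad_field), ("vortex field", vortex)):
        print(name, line_integral(f, upper), line_integral(f, lower))
```

For the gradient field both arcs give φ(B) − φ(A) = 2. For the second field the two arcs give −π and π, even though its curl vanishes at every point where it is defined: the region in which it is defined excludes the z-axis and is therefore multiply connected, so statement e no longer implies statement a.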

EXERCISES 

1 If $\mathbf{F} = 2y\mathbf{i} + x\mathbf{j} + z^2\mathbf{k}$, evaluate $\int_{(0,0,0)}^{(1,1,1)} \mathbf{F}\cdot d\mathbf{R}$ along

a The rectilinear path from (0,0,0) to (1,0,0) to (1,1,0) to (1,1,1)

b The rectilinear path from (0,0,0) to (1,1,0) to (1,1,1)

c The straight line joining (0,0,0) to (1,1,1)

d The curve $x^2 + y^2 = 2z$, $x = y$

2 If $\mathbf{F} = x\mathbf{i} + y\mathbf{j} + 2\mathbf{k}$, evaluate $\iint_S \mathbf{F}\cdot\mathbf{N}\,dS$ over

a The surface of the cube whose vertices are (0,0,0), (1,0,0), (1,1,0), (0,1,0), (0,0,1), (1,0,1),
(1,1,1), (0,1,1)

b The portion of the plane $x + 2y + 3z = 6$ which lies in the first octant


c The entire surface of the sphere $x^2 + y^2 + z^2 = 1$

d The portion of the cone $x^2 + y^2 - (1 - z)^2 = 0$ between the planes z

3 If $\mathbf{F} = y\mathbf{i} + x\mathbf{j} + z^2\mathbf{k}$, evaluate ^ dF throughout

a The volume bounded by the cube whose vertices are (0,0,0), (1,0,0), (1,1,0), (0,1,0),
(0,0,1), (1,0,1), (1,1,1), (0,1,1)

b The volume cut off from the first octant by the plane $x + 2y + 3z = 6$

c The upper half of the volume within the sphere $x^2 + y^2 + z^2 = 1$

d The volume under the paraboloid $z = 1 - x^2 - y^2$ and above the plane $z = 0$

4 Write the divergence theorem in cartesian form.

5 Write Green’s theorem in cartesian form.

6 Write Gauss’ theorem in cartesian form.

7 Write Stokes’ theorem in cartesian form.


8 If S is a closed surface, what is $\iint_S \mathbf{N}\cdot\nabla\times\mathbf{F}\,dS$?

9 If T is the variable unit tangent to a curve C, what is $\int_C \mathbf{T}\cdot d\mathbf{R}$? Can Stokes’ theorem be
used to evaluate this integral?

10 If A is a constant vector and C is a closed curve, show that $\int_C \mathbf{A}\cdot d\mathbf{R} = 0$. What is $\int_C d\mathbf{R}$?

11 If C is a closed curve, show that $\int_C \mathbf{R}\cdot d\mathbf{R} = 0$.

12 If C is a closed curve, show that $\int_C (u\,\nabla v + v\,\nabla u)\cdot d\mathbf{R} = 0$.

13 If S is a closed surface, show that $\iint_S \mathbf{N}\cdot\mathbf{R}\,dS = 3V$, where V is the volume enclosed
by S.

14 If S is an arbitrary closed surface and $\iint_S \mathbf{N}\cdot\mathbf{F}\,dS = 0$, can we conclude that $\mathbf{F} = 0$?
Can we if S is an arbitrary open surface?

15 By applying the divergence theorem to the vector $\phi\mathbf{A}$, where A is an arbitrary constant
vector, show that $\iint_S \mathbf{N}\phi\,dS = \iiint_V \nabla\phi\,dV$. What is $\iint_S \mathbf{N}\,dS$?

16 By applying Stokes’ theorem to the vector $\phi\mathbf{A}$, where A is an arbitrary constant vector,
show that $\int_C \phi\,d\mathbf{R} = \iint_S \mathbf{N}\times\nabla\phi\,dS$.

17 If S is an open surface, what is $\iint_S \mathbf{N}\times\mathbf{R}\,dS$? (Hint: Use the result of Exercise 16.)

18 By applying Stokes’ theorem to the vector $\mathbf{F}\times\mathbf{A}$, where A is an arbitrary constant vector,
show that $\int_C d\mathbf{R}\times\mathbf{F} = \iint_S (\mathbf{N}\times\nabla)\times\mathbf{F}\,dS$. What is $\int_C d\mathbf{R}\times\mathbf{R}$?

19 Verify the divergence theorem for the function $2xz\mathbf{i} + yz\mathbf{j} + z^2\mathbf{k}$ over the upper half of the
sphere $x^2 + y^2 + z^2 = a^2$.

20 Verify the divergence theorem for the function $y\mathbf{i} + x\mathbf{j} + z^2\mathbf{k}$ over the cylindrical region
bounded by $x^2 + y^2 = a^2$, $z = 0$, and $z = a$.

21 Verify the divergence theorem for the function $x^2\mathbf{i} + z\mathbf{j} + yz\mathbf{k}$ over the cube whose vertices
are (0,0,0), (1,0,0), (1,1,0), (0,1,0), (0,0,1), (1,0,1), (1,1,1), and (0,1,1).

22 Verify Stokes’ theorem for the function $xy\mathbf{i} + yz\mathbf{j} + z^2\mathbf{k}$ over the cube described in Exercise
21 if the face of the cube in the xy-plane is missing.

23 What is the surface integral of the normal component of the curl of the vector $(x + y)\mathbf{i} +
(y - x)\mathbf{j} + z^3\mathbf{k}$ over the upper half of the sphere $x^2 + y^2 + z^2 = 1$?

24 If at each point of a surface S the vector F(x,y,z) is perpendicular to S, prove that the curl
of F either vanishes identically or is everywhere tangent to S. (Hint: Apply Stokes’ theorem
to F over the portion of S bounded by an arbitrary closed curve on S.)

25 If at each point of a closed surface S the vector F(x,y,z) is perpendicular to S, prove that
$\iiint_V \nabla\times\mathbf{F}\,dV = 0$. (Hint: Use the result of Example 1.)




26 If A is an arbitrary constant vector, show that $\iint_S \mathbf{N}\times(\mathbf{A}\times\mathbf{R})\,dS = 2V\mathbf{A}$, where V is
the volume bounded by the closed surface S. (Hint: Use the result of Example 1.)

27 Show that $\iiint_V \nabla^2\phi\,dV = \iint_S \dfrac{\partial\phi}{\partial n}\,dS$, where $\dfrac{\partial\phi}{\partial n}$ is the directional deriva-
tive of $\phi$ in the direction of the outer normal to the closed surface S which bounds the volume
V.

28 If $\phi(x,y,z)$ is a solution of Laplace’s equation, show that




surface S. Hence show also that j 

29 Extend Gauss’ theorem to the case in which O lies on the surface S.

30 Show that although the function


is continuous and equal to the gradient of 


at all points of the region between the two cylinders 


x i _j_ y 2 — i..£ an( j x s -f- y 2 = . 4 ' ' . 

the integral $\int \mathbf{F}\cdot d\mathbf{R}$ is not independent of the path in this region. [Hint: Take A to be
(−1,0,0) and B to be (1,0,0), and compute $\int_A^B \mathbf{F}\cdot d\mathbf{R}$ along the upper and lower arcs of the
circle $x^2 + y^2 = 1$, $z = 0$.]


12.6 Further Applications


One of the most important uses of vector analysis is in the concise
formulation of physical laws and the derivation of other results
from those laws. As a first example of this sort, we shall develop
the concept of potential and obtain the partial differential equa-
tion satisfied by the gravitational potential.

To do this, let us suppose that we have a field of force of some
kind, or, in other words, let us consider a region of space in which
at every point a force vector F is defined. The field might, for
instance, be gravitational, in which case F(x,y,z) would be the
force acting on a unit mass at the general point P:(x,y,z) because
of the attraction of other masses present in the region. On the
other hand, the field might be electrostatic, in which case F(x,y,z)
would be the force acting on a unit charge at the general point
P:(x,y,z) because of the attraction or repulsion of other charges
present in the region. Or the field might be magnetic, in which
case F(x,y,z) would be the force acting on a unit magnetic pole
situated at the point P:(x,y,z). In any case, the force F experienced
by a unit test body of the appropriate nature is called the field
intensity.

Now, the amount of work that must be done when a unit
test body is moved along an arbitrary curve in the force field
defined by a vector function F is the line integral of the tangential
component of F; that is,

$$W = \int \mathbf{F}\cdot d\mathbf{R}$$

If there is no dissipation of energy through friction or similar
effects, then, according to the law of the conservation of energy,
this integral must be zero around every closed path, and, hence,
by Theorem 6, Sec. 12.5, it must be independent of the path
between any given points A and B. Fields for which this is the
case are said to be conservative. Furthermore, according to
Theorem 6, Sec. 12.5, it is clear that in a conservative field the
force vector F is the gradient of the scalar function

$$\phi(x,y,z) = \int_{P_0}^{P} \mathbf{F}\cdot d\mathbf{R}$$

The function $\phi$ is called the potential function* of the field. In
most problems, the masses or charges which produce F are given,
and it is required to find F itself. Since $\mathbf{F} = \nabla\phi$, it is clear that
knowing $\phi$ is equivalent to knowing F, and, hence, the determina-
tion of $\phi$ is of prime importance in most field problems.

Assuming, for definiteness, that we are dealing with a
gravitational field, let F be the field intensity at a general point
P:(x,y,z), and let $\Delta\mathbf{F}$ be the contribution to F due to the infinites-
imal mass $\Delta m_1$ in an infinitesimal volume $\Delta V_1 = \Delta x_1\,\Delta y_1\,\Delta z_1$
enclosing the point $P_1\!:(x_1,y_1,z_1)$. According to Newton’s law of
universal gravitation, $\Delta\mathbf{F}$ is a vector whose magnitude is proportional to

$$\frac{\Delta m_1}{r^2} \quad\text{where}\quad r^2 = (x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2$$

and whose direction is opposite to that of the vector

$$\mathbf{R} = (x - x_1)\mathbf{i} + (y - y_1)\mathbf{j} + (z - z_1)\mathbf{k}$$

extending from $P_1$ to P (Fig. 12.30). In other words, if units are
so chosen that the constant in Newton’s law is equal to unity, the
field intensity at P due to the infinitesimal mass $\Delta m_1$ at $P_1$ is

$$\Delta\mathbf{F} = -\frac{\Delta m_1}{r^2}\,\frac{\mathbf{R}}{r} = -\rho(x_1,y_1,z_1)\,\Delta V_1\,\frac{\mathbf{R}}{r^3} \tag{1}$$

where $\rho(x_1,y_1,z_1)$ is the density of the material at the point $P_1$.

* Many writers define the potential to be $\int_P^{P_0} \mathbf{F}\cdot d\mathbf{R}$, in which case $\mathbf{F} = -\nabla\phi$.
In particular, $P_0$ is often taken to be infinitely distant, so that
$\phi = \int_P^{\infty} \mathbf{F}\cdot d\mathbf{R}$.

FIGURE 12.30 Figure used in calculating the potential at a point P due to the material in a volume element $\Delta V_1$.

Now let S be an arbitrary closed regular surface bounding a
volume V, and let I denote the integral over S of the normal
component of the force due to all the attracting material in the
field. By definition, since $\mathbf{F} = \nabla\phi$, we have

$$I = \iint_S \mathbf{N}\cdot\mathbf{F}\,dS = \iint_S \mathbf{N}\cdot\nabla\phi\,dS \tag{2}$$

However, I can also be computed by first determining the part $\Delta I$
of it due to the material within $\Delta V_1$ and then taking all the
material in the field into account by integration. From this point
of view we have, from (1) and (2),

$$\Delta I = \iint_S \mathbf{N}\cdot\Delta\mathbf{F}\,dS = -\iint_S [\rho(x_1,y_1,z_1)\,\Delta V_1]\,\mathbf{N}\cdot\frac{\mathbf{R}}{r^3}\,dS$$

$$= -\rho(x_1,y_1,z_1)\,\Delta V_1 \iint_S \mathbf{N}\cdot\frac{\mathbf{R}}{r^3}\,dS$$

since $x_1, y_1, z_1$ are constant with respect to the x,y,z-integration
over S. The last integral can, of course, be evaluated by Gauss’
theorem (Theorem 3, Sec. 12.5). Specifically, if the origin of R,
namely, the point $P_1\!:(x_1,y_1,z_1)$, is within S, the value of the
integral is $4\pi$; otherwise the value of the integral is 0. Hence,

$$\Delta I = \begin{cases} -4\pi\rho(x_1,y_1,z_1)\,\Delta V_1 & \Delta V_1 \text{ within } S \\ 0 & \Delta V_1 \text{ outside } S \end{cases}$$

and, therefore, in computing I it is necessary to integrate only
over the volume V bounded by S. Doing this, we find

$$I = \int dI = -4\pi \iiint_V \rho(x_1,y_1,z_1)\,dV_1$$

or, since $x_1, y_1, z_1$ are just dummy variables,

$$I = -4\pi \iiint_V \rho(x,y,z)\,dV \tag{3}$$

Equating the two expressions (2) and (3) which we now have
for I, we get

$$\iint_S \mathbf{N}\cdot\nabla\phi\,dS = -4\pi \iiint_V \rho(x,y,z)\,dV$$


If we now apply the divergence theorem to the integral on the
left, we have

$$\iiint_V \nabla\cdot(\nabla\phi)\,dV = -4\pi \iiint_V \rho(x,y,z)\,dV$$

or

$$\iiint_V [\nabla^2\phi + 4\pi\rho(x,y,z)]\,dV = 0$$

Since this holds for any arbitrary volume V, it follows that the
integrand must vanish identically,* and, therefore, that

$$\nabla^2\phi = -4\pi\rho(x,y,z) \tag{4}$$

This is Poisson’s equation,† and we have thus shown that in
regions occupied by matter, the gravitational potential satisfies
Poisson’s equation. In empty space $\rho(x,y,z) = 0$, and thus in
empty space the gravitational potential satisfies Laplace’s equation

$$\nabla^2\phi = 0 \tag{5}$$

Results similar to these hold for the electrostatic and magnetic
potentials.
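Equations (4) and (5) can be spot-checked numerically. The sketch below rests on assumptions that are not part of the text above: it uses the standard closed-form potential of a uniform sphere of mass M and radius a, namely φ = M(3a² − r²)/(2a³) inside and φ = M/r outside (with the sign convention F = ∇φ and unit gravitational constant), and it approximates the Laplacian by a seven-point central difference. The function names and the sample points are hypothetical choices.

```python
import math

M, A = 1.0, 2.0                              # illustrative mass and radius (assumed)
RHO = M / (4.0 / 3.0 * math.pi * A**3)       # uniform density inside the sphere

def phi_inside(x, y, z):
    # Assumed standard potential inside a uniform sphere (F = grad phi convention)
    r2 = x * x + y * y + z * z
    return M * (3 * A * A - r2) / (2 * A**3)

def phi_outside(x, y, z):
    # Assumed standard potential outside the sphere: that of a point mass M
    return M / math.sqrt(x * x + y * y + z * z)

def laplacian(f, x, y, z, h=1e-3):
    """Seven-point central-difference approximation to the Laplacian of f."""
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6 * f(x, y, z)) / (h * h)

if __name__ == "__main__":
    print("inside :", laplacian(phi_inside, 0.3, -0.5, 0.7), "vs", -4 * math.pi * RHO)
    print("outside:", laplacian(phi_outside, 3.0, 1.0, -2.0), "vs", 0.0)
```

At the interior point the computed Laplacian should agree with −4πρ = −3M/a³, as Poisson’s equation requires, and at the exterior point it should be close to zero, as Laplace’s equation requires.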

As a second example of the use of vector analysis in formulat-
ing physical laws in mathematical terms, we shall now derive
Maxwell’s equations‡ for electric and magnetic fields. To do this
we shall have to work with the vector quantities:

E = electric intensity
H = magnetic intensity
D = $\epsilon$E = electric flux density
B = $\mu$H = magnetic flux density
J = current density

and the scalars:

$\epsilon$ = permittivity
$\mu$ = permeability
$\sigma$ = conductivity
Q = charge density
$q = \iiint_V Q\,dV$ = total charge within V
$\phi = \iint_S \mathbf{N}\cdot\mathbf{B}\,dS$ = total magnetic flux passing through S
$i = \iint_S \mathbf{N}\cdot\mathbf{J}\,dS$ = total current flowing through S

* Suppose that this is not the case, and let $P_0$ be a point at which the inte-
grand does not vanish. Then, if $\rho(x,y,z)$ and $\nabla^2\phi$ are continuous (as we have
implicitly assumed), it follows that, throughout some sufficiently small
three-dimensional region $V_0$ enclosing $P_0$, the integrand has everywhere the
same sign it has at $P_0$. Integrating over $V_0$, we then obtain an integral which
is not equal to zero, contrary to the fact that the integral has been shown
to be zero for every volume V.

† Named for the French mathematical physicist Siméon Denis Poisson
(1781-1840).

‡ Named for the English mathematical physicist James Clerk Maxwell
(1831-1879).


These quantities are connected by a number of equations expres-
sing relations discovered experimentally in the early years of the
nineteenth century, chiefly by Michael Faraday (1791-1867). In
particular we have Faraday’s law,

$$\oint_C \mathbf{E}\cdot d\mathbf{R} = -\frac{\partial\phi}{\partial t} \tag{6}$$

which asserts that the integral of the tangential component of the
electric intensity vector around any closed curve C is equal but
opposite in sign to the rate of change of the magnetic flux passing
through any surface spanning C; Ampère’s law,

$$\oint_C \mathbf{H}\cdot d\mathbf{R} = i \tag{7}$$

which asserts that the integral of the tangential component of the
magnetic intensity vector around any closed curve is equal to
the current flowing through any surface spanning C; Gauss’ law
for electric fields,

$$\iint_S \mathbf{N}\cdot\mathbf{D}\,dS = q \tag{8}$$

which asserts that the integral of the normal component of the
electric flux density over any closed surface S is equal to the total
electric charge enclosed by S; and Gauss’ law for magnetic fields,

$$\iint_S \mathbf{N}\cdot\mathbf{B}\,dS = 0 \tag{9}$$

which asserts that the total magnetic flux $\phi$ passing through a
closed surface is zero.

If we now apply Stokes’ theorem to Faraday’s law (6), we
have

$$\iint_S \mathbf{N}\cdot\nabla\times\mathbf{E}\,dS = -\frac{\partial\phi}{\partial t}$$

and, substituting for $\phi$ from its definition in terms of B,

$$\iint_S \mathbf{N}\cdot\nabla\times\mathbf{E}\,dS = -\frac{\partial}{\partial t}\iint_S \mathbf{N}\cdot\mathbf{B}\,dS = -\iint_S \mathbf{N}\cdot\frac{\partial\mathbf{B}}{\partial t}\,dS$$

Since S is an arbitrary surface spanning the arbitrary closed
curve C, the last equation can hold only if

$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t} \tag{10}$$

Similarly, by applying Stokes’ theorem to Ampère’s law (7), we
obtain

$$\iint_S \mathbf{N}\cdot\nabla\times\mathbf{H}\,dS = i = \iint_S \mathbf{N}\cdot\mathbf{J}\,dS$$

and again, since S is an arbitrary open surface, we conclude that
the vectors being integrated over S must be identical:

$$\nabla\times\mathbf{H} = \mathbf{J} \tag{11}$$


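Now, as Maxwell was the first to realize, the current density
J consists of two parts, namely, a conduction current density

$$\mathbf{J}_c = \sigma\mathbf{E}$$

due to the flow of electric charges, and a displacement current
density

$$\mathbf{J}_d = \frac{\partial\mathbf{D}}{\partial t} = \epsilon\,\frac{\partial\mathbf{E}}{\partial t}$$

due to the time variation of the electric field. Thus,

$$\mathbf{J} = \sigma\mathbf{E} + \epsilon\,\frac{\partial\mathbf{E}}{\partial t}$$

and (11) becomes

$$\nabla\times\mathbf{H} = \sigma\mathbf{E} + \epsilon\,\frac{\partial\mathbf{E}}{\partial t} \tag{12}$$

Next we apply the divergence theorem to the first of Gauss’
laws (8), getting

$$\iiint_V \nabla\cdot\mathbf{D}\,dV = q = \iiint_V Q\,dV$$

whence, since V is arbitrary,

$$\nabla\cdot\mathbf{D} = Q \tag{13}$$

In the same way, by applying the divergence theorem to Gauss’
second law (9), we find that

$$\iiint_V \nabla\cdot\mathbf{B}\,dV = 0$$

and, therefore, since V is arbitrary,

$$\nabla\cdot\mathbf{B} = 0 \tag{14}$$

Now, if we take the curl of Eq. (10), we obtain

$$\nabla\times(\nabla\times\mathbf{E}) = -\nabla\times\frac{\partial\mathbf{B}}{\partial t} = -\frac{\partial}{\partial t}(\nabla\times\mathbf{B}) = -\mu\,\frac{\partial}{\partial t}(\nabla\times\mathbf{H})$$

If we expand the term $\nabla\times(\nabla\times\mathbf{E})$ by means of Eq. (20), Sec.
12.3, the last equation becomes

$$\nabla(\nabla\cdot\mathbf{E}) - \nabla^2\mathbf{E} = -\mu\,\frac{\partial}{\partial t}(\nabla\times\mathbf{H})$$

and, substituting for $\nabla\times\mathbf{H}$ from (12),

$$\nabla(\nabla\cdot\mathbf{E}) - \nabla^2\mathbf{E} = -\mu\,\frac{\partial}{\partial t}\left(\sigma\mathbf{E} + \epsilon\,\frac{\partial\mathbf{E}}{\partial t}\right) \tag{15}$$

Now, if the space charge density Q is zero, as it is to a high degree
of approximation in both good dielectrics and good conductors,
then from (13) and the relation $\mathbf{D} = \epsilon\mathbf{E}$ we see that

$$\nabla\cdot\mathbf{E} = 0$$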



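Therefore, Eq. (15) reduces to

$$\nabla^2\mathbf{E} = \mu\epsilon\,\frac{\partial^2\mathbf{E}}{\partial t^2} + \mu\sigma\,\frac{\partial\mathbf{E}}{\partial t}$$

which is Maxwell’s equation for the electric intensity vector E.
Similarly, if we take the curl of Eq. (12) we obtain

$$\nabla\times(\nabla\times\mathbf{H}) = \nabla\times\left(\sigma\mathbf{E} + \epsilon\,\frac{\partial\mathbf{E}}{\partial t}\right)$$

and, expanding the left-hand side,

$$\nabla(\nabla\cdot\mathbf{H}) - \nabla^2\mathbf{H} = \sigma\,\nabla\times\mathbf{E} + \epsilon\,\nabla\times\frac{\partial\mathbf{E}}{\partial t} = \sigma\,\nabla\times\mathbf{E} + \epsilon\,\frac{\partial}{\partial t}(\nabla\times\mathbf{E})$$

Now, substituting for $\nabla\times\mathbf{E}$ from (10), we have

$$\nabla(\nabla\cdot\mathbf{H}) - \nabla^2\mathbf{H} = -\left(\sigma\,\frac{\partial\mathbf{B}}{\partial t} + \epsilon\,\frac{\partial^2\mathbf{B}}{\partial t^2}\right)$$

But $\mathbf{B} = \mu\mathbf{H}$ by definition. Hence, (14) implies that $\nabla\cdot\mathbf{H} = 0$, and,
therefore, the last equation reduces to

$$\nabla^2\mathbf{H} = \mu\epsilon\,\frac{\partial^2\mathbf{H}}{\partial t^2} + \mu\sigma\,\frac{\partial\mathbf{H}}{\partial t}$$

which is Maxwell’s equation for the magnetic intensity vector H.

For a perfect dielectric, $\sigma = 0$. Hence, in this case Maxwell’s
equations reduce to the three-dimensional wave equations

$$\nabla^2\mathbf{E} = \mu\epsilon\,\frac{\partial^2\mathbf{E}}{\partial t^2} \quad\text{and}\quad \nabla^2\mathbf{H} = \mu\epsilon\,\frac{\partial^2\mathbf{H}}{\partial t^2}$$

On the other hand, in a good conductor the terms arising from the
displacement current, i.e., the terms containing the second time
derivatives, are negligible, and Maxwell’s equations reduce to

$$\nabla^2\mathbf{E} = \mu\sigma\,\frac{\partial\mathbf{E}}{\partial t} \quad\text{and}\quad \nabla^2\mathbf{H} = \mu\sigma\,\frac{\partial\mathbf{H}}{\partial t}$$

which are examples of the three-dimensional heat equation.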
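For the perfect dielectric the wave equation implies a propagation speed of 1/√(με). The Python sketch below checks this on the simplest case. It is an illustration under assumptions made here: one field component is taken to be a plane wave sin k(x − ct) depending on x and t only, the numerical values of μ, ε, and k are arbitrary, and the second derivatives are approximated by central differences.

```python
import math

MU, EPS = 2.0, 0.125                 # illustrative values in arbitrary units (assumed)
C = 1.0 / math.sqrt(MU * EPS)        # speed suggested by the wave equation

def E(x, t, k=1.5, speed=C):
    # One component of a plane wave travelling in the x-direction (assumed form)
    return math.sin(k * (x - speed * t))

def second_diff(f, s, h=1e-3):
    """Central-difference approximation to the second derivative of f at s."""
    return (f(s + h) - 2 * f(s) + f(s - h)) / (h * h)

def residual(x, t, speed=C):
    """Value of E_xx - mu*eps*E_tt, which vanishes for a solution of the wave equation."""
    e_xx = second_diff(lambda s: E(s, t, speed=speed), x)
    e_tt = second_diff(lambda s: E(x, s, speed=speed), t)
    return e_xx - MU * EPS * e_tt

if __name__ == "__main__":
    print("speed 1/sqrt(mu*eps):", residual(0.4, 0.1))
    print("some other speed    :", residual(0.4, 0.1, speed=1.0))
```

The residual should be negligible when the wave travels with speed 1/√(με) and clearly different from zero for any other speed.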

As a final application of the methods of vector analysis, we
shall investigate the question of whether or not a solution of the
heat equation satisfying prescribed boundary and initial condi-
tions over a given region is necessarily unique. In our discussion
of boundary value problems in Chap. 8 we proceeded on the
assumption that this was the case. Nevertheless, examples have
been given* of solutions of the one-dimensional heat equation

$$a^2\,\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}$$

which possess derivatives of all orders for all values of x and t,
satisfy identical initial conditions everywhere on the entire x-axis,
and yet are different! Confronted with such a clear-cut failure of
intuition, we must regard the uniqueness question as of more than
academic interest and any positive result as having important
practical significance.

* See, for instance, P. C. Rosenbloom and D. V. Widder, “A Temperature
Function which Vanishes Identically,” Am. Math. Monthly, vol. 65, p. 607,
October, 1958.

Let us suppose, then, that we are to solve the three-dimen-
sional heat equation

$$a^2\,\frac{\partial u}{\partial t} = \nabla^2 u$$

throughout a region V bounded by the closed surface S, subject
to the boundary condition

$$u = f(x,y,z,t) \quad\text{on } S$$

and the initial condition

$$u(x,y,z,0) = g(x,y,z) \quad\text{throughout } V$$

Furthermore, let us suppose that we have two solutions of the
problem, $u_1$ and $u_2$, each of which, with its derivatives through
the second, is continuous in V.

If we define a new function

$$w(x,y,z,t) = u_2(x,y,z,t) - u_1(x,y,z,t)$$

it is clear from the linearity of the heat equation that w also
satisfies this equation. Moreover, w obviously assumes boundary
and initial conditions which are identically zero. Finally, w is
continuous and differentiable, since it is the difference of two
functions with these properties.

Now consider the volume integral

$$J(t) = \frac{1}{2}\iiint_V w^2(x,y,z,t)\,dV \qquad t \ge 0 \tag{16}$$

Clearly, J(t) is a continuous function which is always equal to or
greater than zero, since its integrand is everywhere nonnegative.
Also, since w = 0 when t = 0, it follows that J(0) = 0. Now

$$J'(t) = \frac{1}{2}\iiint_V 2w\,\frac{\partial w}{\partial t}\,dV$$

and, thus, since w satisfies the heat equation, we have

$$J'(t) = \frac{1}{a^2}\iiint_V w\,\nabla^2 w\,dV \tag{17}$$

To this, let us apply Eq. (7), Sec. 12.5, with both u and v in the
formula taken to be the function w of the present problem. Then

$$\iiint_V (w\,\nabla^2 w + \nabla w\cdot\nabla w)\,dV = \iint_S \mathbf{N}\cdot w\,\nabla w\,dS \tag{18}$$

Since the function w vanishes identically on S, the integral on the
right side of (18) is zero, and we have

$$\iiint_V w\,\nabla^2 w\,dV = -\iiint_V \nabla w\cdot\nabla w\,dV$$

Hence, substituting into (17),

$$J'(t) = -\frac{1}{a^2}\iiint_V \nabla w\cdot\nabla w\,dV = -\frac{1}{a^2}\iiint_V \left[\left(\frac{\partial w}{\partial x}\right)^2 + \left(\frac{\partial w}{\partial y}\right)^2 + \left(\frac{\partial w}{\partial z}\right)^2\right] dV$$

which shows that

$$J'(t) \le 0 \quad\text{for } t \ge 0$$

Now, by the law of the mean,

$$\frac{J(t) - J(0)}{t} = J'(t_1) \qquad 0 < t_1 < t$$

or

$$J(t) = J(0) + t\,J'(t_1) \qquad 0 < t_1 < t$$

But we have already verified that J(0) = 0. Hence, the last
equation reduces to

$$J(t) = t\,J'(t_1)$$

which shows that

$$J(t) \le 0 \quad\text{for } t \ge 0 \tag{19}$$

since we have just proved that J'(t) is nonpositive for $t \ge 0$. How-
ever, as we observed earlier, the definition of J(t) shows that

$$J(t) \ge 0 \quad\text{for } t \ge 0 \tag{20}$$

The only way in which the inequalities (19) and (20) can simul-
taneously be fulfilled is for J(t) to be identically zero. But this is
possible if and only if the integrand of J(t) vanishes identically.
Hence,

$$w(x,y,z,t) = u_2(x,y,z,t) - u_1(x,y,z,t) = 0$$

or

$$u_2(x,y,z,t) = u_1(x,y,z,t)$$

Thus in bounded regions, twice differentiable solutions of the heat
equation satisfying prescribed surface and initial temperature con-
ditions are unique.
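The key step of the argument, that J(t) cannot increase when w vanishes on the boundary, can be observed numerically. The sketch below is an illustration under assumptions made here, not a reproduction of the proof: it treats the one-dimensional analogue a² ∂w/∂t = ∂²w/∂x² on 0 ≤ x ≤ 1 with w = 0 at both ends, starts from an arbitrary nonzero w so that the decay is visible, and advances w by the explicit forward-time, centered-space difference scheme with mesh ratio r = Δt/(a² Δx²) ≤ 1/2. The function name and all parameter values are hypothetical.

```python
import math

def energy_history(n=50, steps=400, r=0.4):
    """Return J = (1/2) * sum(w^2) * dx after each explicit time step.

    Solves a^2 w_t = w_xx on 0 <= x <= 1 with w = 0 at both ends.
    r = dt / (a^2 dx^2) is the mesh ratio; the scheme is stable for r <= 1/2.
    """
    dx = 1.0 / n
    # Arbitrary nonzero starting values that vanish at the ends (assumed)
    w = [math.sin(math.pi * i * dx) + 0.5 * math.sin(3 * math.pi * i * dx)
         for i in range(n + 1)]
    w[0] = w[n] = 0.0
    history = []
    for _ in range(steps + 1):
        history.append(0.5 * sum(v * v for v in w) * dx)
        new = w[:]
        for i in range(1, n):
            new[i] = w[i] + r * (w[i + 1] - 2 * w[i] + w[i - 1])
        w = new                      # end values stay at zero
    return history

if __name__ == "__main__":
    h = energy_history()
    print("J at start:", h[0])
    print("J at end  :", h[-1])
    print("never increases:", all(b <= a for a, b in zip(h, h[1:])))
```

In the uniqueness proof w is zero initially, so J(0) = 0, and a quantity that is nonnegative and cannot increase from zero must remain zero; the sketch merely makes the monotone decay of J visible for a nonzero starting function.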

EXERCISES 

1 What is the potential function for a central force field in which the attraction on a particle 
varies directly as the square of the distance from the origin? inversely as the distance from 
the origin? 

2 What is the potential function of the force field due to uniform rotation about the z-axis?

3 What is the potential function for the gravitational field of a uniform circular disk at any 
point on the axis of the disk? 

4 What is the potential function for the gravitational field of a uniform sphere of radius a
and mass M? Show that the attraction of the sphere at a point P a distance r from the center
of the sphere is

r g a 
r^a 



5 Show that the electrostatic field intensity at a point P due to a set of charges $q_i$ is equal to



where $\mathbf{R}_i$ is the vector from the point P to the point $P_i$ where the charge $q_i$ is located.
Verify that $\nabla\cdot\mathbf{E} = 0$ in this case.

6 Show that the work done in bringing a charge of strength q from infinity to a point at a
distance of $r_0$ from a fixed charge $q_0$ is $qq_0/r_0$. Using this result, determine the total energy
in the electrostatic field defined by the fixed charges $q_1, q_2, \ldots, q_n$ whose mutual distances
are $r_{ij}$.

If a conductor is defined to be a body in whose interior the electric field is everywhere zero, 
show that any charge on a conductor must be located entirely on its surface. 

Let $V_1$ and $V_2$ be two regions with respective dielectric constants $\epsilon_1$ and $\epsilon_2$, and let S be the
surface of discontinuity which separates them. By applying Gauss' law for electric fields
to a closed cylindrical surface of infinitesimal height whose bases are parallel to S in
the respective media, show that, if there are no charges on S, the normal component of the
electric flux density is continuous across S. Similarly, by applying Faraday's law to a
rectangle of negligible width whose longer sides are parallel to S in the respective media, 
prove that, if the field is conservative, the tangential component of the electric intensity is 
continuous across S. 

What is the electric field in the empty space between the perfectly conducting, infinite 
planes y = 0 and y = l if q = i + k and —■ = i — k? (Hint: From the nature of 

the region of the problem and the initial conditions, it is clear that the field has no component
in the y-direction and that $E_x$ and $E_z$ are functions only of y.)

Prove that a solution of the heat equation, possessing continuous second partial derivatives, 
which takes on prescribed initial values throughout a region V and whose normal derivative 
takes on prescribed values on the surface S which encloses V is unique. 


CHAPTER THIRTEEN 


Tensor Analysis 


13.1

Introduction

In Chap. 10 we introduced the concept of a vector as either a
(1,n) or an (n,1) matrix, that is, as an ordered set of n quantities.
In the last chapter we took a somewhat less abstract point of view 
and regarded a vector as a quantity which could be represented 
by a directed line segment. Using this interpretation we then 
developed the algebra and calculus of vectors. In doing this, we 
worked implicitly (and sometimes explicitly) in a rectangular 
frame of reference; nonetheless, it should be clear that we were 
dealing with quantities independent of any particular coordinate 
system. For example, though the description of a point, that is, its 
coordinates, may change from one coordinate system to another, 
the point is recognizably the same in all coordinate systems. 
Similarly, although the formula by which it is computed may 
change, the length of a particular vector must be the same in all 
coordinate systems. 

In this chapter we shall pursue further this idea of invariance 
and adopt as our fundamental idea of a vector the concept of a 
quantity invariant under any transformation of coordinates. This 
will lead us to the idea of the covariant and contravariant
representation of vectors and, thence, to the highly important concept
of a tensor. Although we cannot undertake a detailed discussion 
of tensor analysis, we shall undertake to indicate some of its 
principal features and illustrate the remarkable economy of the 
tensor notation. 


13.2 

Oblique coordinates 

Because of the need to distinguish between what we shall soon
refer to as covariant and contravariant vectors, it is necessary that
our notation employ indices not only in the familiar subscript
position but in the superscript position as well. In tensor analysis
this requirement takes precedence over the usual exponential
symbolism, and, henceforth, when we write, say,
$$\xi^\alpha$$
$\alpha$ will be a distinguishing index, like subscripts heretofore, and
not an exponent. If and when we wish to indicate the $\alpha$th power
of a quantity $\xi$ we shall always use parentheses and write
$$(\xi)^\alpha$$

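With this convention in mind, and as a relatively simple
example of the generalized coordinates we shall subsequently
investigate, let us consider a system of coordinates $(\bar{x}^1,\bar{x}^2,\bar{x}^3)$
connected with a system of rectangular coordinates $(x^1,x^2,x^3)$ by the
equations
$$\begin{aligned}
\bar{x}^1 &= a_{11}x^1 + a_{12}x^2 + a_{13}x^3\\
\bar{x}^2 &= a_{21}x^1 + a_{22}x^2 + a_{23}x^3\\
\bar{x}^3 &= a_{31}x^1 + a_{32}x^2 + a_{33}x^3
\end{aligned}
\qquad
|A| = \begin{vmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{vmatrix} \ne 0$$
or, in matric form,
$$\bar{X} = AX \tag{1}$$
and
$$X = A^{-1}\bar{X} \tag{2}$$
where, as usual,†
$$X = \begin{Vmatrix} x^1\\ x^2\\ x^3 \end{Vmatrix} \qquad\text{and}\qquad \bar{X} = \begin{Vmatrix} \bar{x}^1\\ \bar{x}^2\\ \bar{x}^3 \end{Vmatrix}$$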


The locus of points for which $\bar{x}^1 = 0$ is, of course, the plane
$$\pi_1:\quad a_{11}x^1 + a_{12}x^2 + a_{13}x^3 = 0$$
Similarly, the locus of points for which $\bar{x}^2 = 0$ is the plane
$$\pi_2:\quad a_{21}x^1 + a_{22}x^2 + a_{23}x^3 = 0$$
and the locus of points for which $\bar{x}^3 = 0$ is the plane
$$\pi_3:\quad a_{31}x^1 + a_{32}x^2 + a_{33}x^3 = 0$$
Clearly, on the line of intersection of $\pi_2$ and $\pi_3$, both $\bar{x}^2$ and $\bar{x}^3$ are
zero and $\bar{x}^1$ alone varies. This line can, therefore, be thought of as
the $\bar{x}^1$-axis. In the same fashion we can identify the line of
intersection of $\pi_1$ and $\pi_3$ as the $\bar{x}^2$-axis, and the line of intersection of
$\pi_1$ and $\pi_2$ as the $\bar{x}^3$-axis (Fig. 13.1a). Since the point for which
$\bar{x}^1 = \bar{x}^2 = \bar{x}^3 = 0$ obviously lies in $\pi_1$, $\pi_2$, and $\pi_3$, it follows that
the $\bar{x}^1$-, $\bar{x}^2$-, $\bar{x}^3$-axes are concurrent. Moreover, since $|A| \ne 0$,
these lines are distinct and noncoplanar. In general, however, they
will not be mutually perpendicular, and for this reason they are
said to be the axes of an oblique coordinate system.

† Not only the new coordinates themselves, but all quantities referred to
the new coordinate system we shall consistently denote by overbars. Thus,
if $P$ is the name of a point described in the original (rectangular) coordinate
system by the coordinates $(p^1,p^2,p^3)$, then $\bar{P}$ is the name we shall use for this
point thought of as described by the new coordinates $(\bar{p}^1,\bar{p}^2,\bar{p}^3)$ determined
by Eq. (1).

FIGURE 13.1
A rectangular and an oblique coordinate system with their related reference vectors.

Since the $\bar{x}^1$-, $\bar{x}^2$-, and $\bar{x}^3$-axes are noncoplanar, any vector
can be expressed as a linear combination of arbitrary reference
vectors along the three oblique axes. By analogy with the unit
vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$, or $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$, as we shall now denote them, it might
seem natural to choose vectors of unit length for this purpose.
However, because the oblique coordinates $\bar{x}^1$, $\bar{x}^2$, $\bar{x}^3$ are not
distance measures along the oblique axes, as $x^1$, $x^2$, $x^3$ are along
the axes of a rectangular coordinate system, it turns out to be
more convenient to take the new reference vectors $\bar{\mathbf{e}}_1$, $\bar{\mathbf{e}}_2$, $\bar{\mathbf{e}}_3$ to be,
respectively, the vectors from the origin to the points whose
oblique coordinates are (1,0,0), (0,1,0), and (0,0,1) (Fig. 13.1b).

To determine the lengths of the reference vectors $\bar{\mathbf{e}}_1$, $\bar{\mathbf{e}}_2$, $\bar{\mathbf{e}}_3$
and to obtain the formula for measuring distances in general in
oblique coordinates, let us consider the vector $\bar{\mathbf{V}}$† extending from
the point $\bar{P}$ whose matrix of oblique coordinates is
$$\bar{V}_P = \begin{Vmatrix} \bar{p}^1\\ \bar{p}^2\\ \bar{p}^3 \end{Vmatrix}$$
to the point $\bar{Q}$ whose matrix of oblique coordinates is
$$\bar{V}_Q = \begin{Vmatrix} \bar{q}^1\\ \bar{q}^2\\ \bar{q}^3 \end{Vmatrix}$$
From Eq. (2) it follows that the rectangular coordinates of $P$ ($= \bar{P}$) and
$Q$ ($= \bar{Q}$) are defined, respectively, by the matrices
$$V_P = A^{-1}\bar{V}_P \qquad\text{and}\qquad V_Q = A^{-1}\bar{V}_Q$$

† In this chapter we shall use boldface symbols to denote vectors only when
we are considering them as directed line segments, as we did in the last
chapter. In particular, if $\mathbf{V}$ is a vector considered in the geometric sense, we
shall use the symbol $V$ to denote not the length of $\mathbf{V}$ but rather the matrix
of the components of $\mathbf{V}$ along the appropriate set of axes.

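Hence, in rectangular coordinates, the vector $\bar{\mathbf{V}} = \bar{\mathbf{V}}_Q - \bar{\mathbf{V}}_P$
(Fig. 13.2a), defined by the matrix of components $\bar{V} = \bar{V}_Q - \bar{V}_P$,
becomes the vector $\mathbf{V} = \mathbf{V}_Q - \mathbf{V}_P$ (Fig. 13.2b) defined by the
matrix
$$V = V_Q - V_P = A^{-1}\bar{V}_Q - A^{-1}\bar{V}_P = A^{-1}(\bar{V}_Q - \bar{V}_P) = A^{-1}\bar{V}$$

FIGURE 13.2
A vector V represented in each of two coordinate systems.
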
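Now, in rectangular coordinates, the square of the length of a
vector $\mathbf{V}$ whose matrix of components is $V$ is given by the formula
$$\mathbf{V} \cdot \mathbf{V} = V^T I V = V^T G V$$
where, for later convenience, we have introduced $G$ as another
name for the matrix which is $I$ in this case but not in general.
Therefore, since we require the length of a given vector to be the
same in all coordinate systems, we define the scalar product of a
vector with itself in oblique coordinates by the condition that
$$\begin{aligned}
\bar{\mathbf{V}} \cdot \bar{\mathbf{V}} = \mathbf{V} \cdot \mathbf{V} &= (A^{-1}\bar{V})^T I (A^{-1}\bar{V}) = \bar{V}^T (A^{-1})^T I A^{-1} \bar{V}\\
&= \bar{V}^T [(A^{-1})^T A^{-1}] \bar{V}\\
&= \bar{V}^T \bar{G} \bar{V}
\end{aligned} \tag{3}$$
Similarly, for distinct vectors $\bar{\mathbf{U}}$ and $\bar{\mathbf{V}}$, we define
$$\begin{aligned}
\bar{\mathbf{U}} \cdot \bar{\mathbf{V}} = \mathbf{U} \cdot \mathbf{V} &= (A^{-1}\bar{U})^T I (A^{-1}\bar{V}) = \bar{U}^T (A^{-1})^T I A^{-1} \bar{V}\\
&= \bar{U}^T [(A^{-1})^T A^{-1}] \bar{V}\\
&= \bar{U}^T \bar{G} \bar{V}
\end{aligned} \tag{4}$$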

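Thus, the metrical properties of space, which in rectangular
coordinates are determined by the identity matrix $I = G$, are in oblique
coordinates determined by the matrix $(A^{-1})^T A^{-1} = \bar{G}$, where $A$ is the
matrix of the transformation $\bar{X} = AX$ from rectangular to oblique
coordinates.

Denoting by $\bar{g}_{ij}$ the element in the ith row and jth column of
the matrix $\bar{G} = (A^{-1})^T A^{-1}$, it is clear from Eqs. (3) and (4) that,
for the reference vectors $\bar{\mathbf{e}}_1$, $\bar{\mathbf{e}}_2$, $\bar{\mathbf{e}}_3$ defined by the matrices
$$\bar{e}_1 = \begin{Vmatrix} 1\\ 0\\ 0 \end{Vmatrix} \qquad \bar{e}_2 = \begin{Vmatrix} 0\\ 1\\ 0 \end{Vmatrix} \qquad \bar{e}_3 = \begin{Vmatrix} 0\\ 0\\ 1 \end{Vmatrix}$$
we have
$$\bar{\mathbf{e}}_i \cdot \bar{\mathbf{e}}_j = \bar{g}_{ij} \tag{5}$$
In particular, the lengths of $\bar{\mathbf{e}}_1$, $\bar{\mathbf{e}}_2$, and $\bar{\mathbf{e}}_3$ are, respectively,
$$|\bar{\mathbf{e}}_1| = \sqrt{\bar{g}_{11}} \qquad |\bar{\mathbf{e}}_2| = \sqrt{\bar{g}_{22}} \qquad |\bar{\mathbf{e}}_3| = \sqrt{\bar{g}_{33}}$$
In other words, the length of $\bar{\mathbf{e}}_i$ is such that, if $\mathbf{R}$ is the vector extending
along the $\bar{x}^i$-axis from the origin to the point for which $\bar{x}^i = a^i$, then
the relation $|\mathbf{R}| = |a^i| \cdot |\bar{\mathbf{e}}_i|$ holds.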
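
The invariance of length expressed by Eq. (3) can be checked numerically. The following is only an illustrative sketch: the matrix `A`, the vector `V`, and all function names are hypothetical choices made for this example, and exact rational arithmetic is used so that the two lengths can be compared without rounding.

```python
from fractions import Fraction

def minor(m, r, c):
    """2x2 determinant left after deleting row r and column c of a 3x3 matrix."""
    rows = [row[:c] + row[c + 1:] for k, row in enumerate(m) if k != r]
    return rows[0][0] * rows[1][1] - rows[0][1] * rows[1][0]

def inverse3(m):
    """Inverse of a 3x3 integer matrix: transposed cofactors divided by the determinant."""
    det = sum((-1) ** c * m[0][c] * minor(m, 0, c) for c in range(3))
    return [[Fraction((-1) ** (r + c) * minor(m, c, r), det) for c in range(3)]
            for r in range(3)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def quad_form(u, g, v):
    """Evaluate U^T G V for component lists u, v and a 3x3 matrix g."""
    return sum(u[i] * g[i][j] * v[j] for i in range(3) for j in range(3))

A = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]     # hypothetical transformation matrix, |A| = 1
A_inv = inverse3(A)
G_bar = matmul(transpose(A_inv), A_inv)   # metric matrix of the oblique system

V = [3, 4, 0]                             # rectangular components; length 5
V_bar = [sum(A[i][j] * V[j] for j in range(3)) for i in range(3)]  # oblique components A V

length_sq_rect = sum(c * c for c in V)               # V^T I V
length_sq_oblique = quad_form(V_bar, G_bar, V_bar)   # V_bar^T G_bar V_bar, Eq. (3)
print(G_bar, length_sq_rect, length_sq_oblique)
```

For this particular `A` the oblique axes are not perpendicular, which shows up as the nonzero off-diagonal elements of `G_bar`; the squared length nevertheless agrees in the two systems.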

In a rectangular coordinate system a unique set of directions
for the reference vectors is clearly identified by the axes of the
system. In oblique coordinates this is not the case; for, although
the oblique axes certainly define a set of directions in which
reference vectors can naturally be chosen, there is another set
distinct from the first which is also intrinsic in the system, namely,
the directions perpendicular to the coordinate planes $\pi_1$, $\pi_2$, $\pi_3$.
As base vectors in these directions it is customary to take vectors
$\bar{\mathbf{e}}^1$, $\bar{\mathbf{e}}^2$, $\bar{\mathbf{e}}^3$ defined by the conditions
$$\bar{\mathbf{e}}^i \cdot \bar{\mathbf{e}}_j = \begin{cases} 0 & i \ne j\\ 1 & i = j \end{cases} \tag{6}$$
For $i \ne j$ these relations fix the directions of the new reference
vectors, and for $i = j$ they determine their lengths and sense.
The vectors $\bar{\mathbf{e}}^1$, $\bar{\mathbf{e}}^2$, $\bar{\mathbf{e}}^3$ are said to form a set reciprocal to the set
$\bar{\mathbf{e}}_1$, $\bar{\mathbf{e}}_2$, $\bar{\mathbf{e}}_3$, and vice versa* (Fig. 13.3). From their definition it is
clear that $\bar{\mathbf{e}}^1$, $\bar{\mathbf{e}}^2$, $\bar{\mathbf{e}}^3$ are noncoplanar and, hence, can be used as a
basis for the representation of any vector. Thus, when we use
oblique coordinates, any vector $\mathbf{V}$ has two different but equally
natural representations: It can be expressed as a linear combination
of the base vectors $\bar{\mathbf{e}}_1$, $\bar{\mathbf{e}}_2$, $\bar{\mathbf{e}}_3$, or it can be expressed as a linear
combination of the vectors of the reciprocal set $\bar{\mathbf{e}}^1$, $\bar{\mathbf{e}}^2$, $\bar{\mathbf{e}}^3$.

* It is evident that in rectangular coordinates the set of base vectors and the
set of reciprocal vectors are the same; that is, $\mathbf{i} = \mathbf{e}_1 = \mathbf{e}^1$, $\mathbf{j} = \mathbf{e}_2 = \mathbf{e}^2$,
$\mathbf{k} = \mathbf{e}_3 = \mathbf{e}^3$. It is for this reason that the concept of reciprocal sets of
vectors was not introduced in the last chapter (except in Exercise 23, Sec. 12.1).

FIGURE 13.3
The base vectors $\bar{\mathbf{e}}_1$, $\bar{\mathbf{e}}_2$, $\bar{\mathbf{e}}_3$ and the reciprocal base vectors $\bar{\mathbf{e}}^1$, $\bar{\mathbf{e}}^2$, $\bar{\mathbf{e}}^3$ in an oblique coordinate system.

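In particular, the vectors in each of the sets $\bar{\mathbf{e}}_1$, $\bar{\mathbf{e}}_2$, $\bar{\mathbf{e}}_3$ and
$\bar{\mathbf{e}}^1$, $\bar{\mathbf{e}}^2$, $\bar{\mathbf{e}}^3$ must be expressible as linear combinations of the vectors
in the other set. Specifically, if we write
$$\begin{aligned}
\bar{\mathbf{e}}_1 &= \mu_{11}\bar{\mathbf{e}}^1 + \mu_{12}\bar{\mathbf{e}}^2 + \mu_{13}\bar{\mathbf{e}}^3\\
\bar{\mathbf{e}}_2 &= \mu_{21}\bar{\mathbf{e}}^1 + \mu_{22}\bar{\mathbf{e}}^2 + \mu_{23}\bar{\mathbf{e}}^3\\
\bar{\mathbf{e}}_3 &= \mu_{31}\bar{\mathbf{e}}^1 + \mu_{32}\bar{\mathbf{e}}^2 + \mu_{33}\bar{\mathbf{e}}^3
\end{aligned}$$
and then form the scalar product of each side of the ith equation
with $\bar{\mathbf{e}}_j$, we obtain
$$\bar{\mathbf{e}}_i \cdot \bar{\mathbf{e}}_j = \mu_{i1}\,\bar{\mathbf{e}}^1 \cdot \bar{\mathbf{e}}_j + \mu_{i2}\,\bar{\mathbf{e}}^2 \cdot \bar{\mathbf{e}}_j + \mu_{i3}\,\bar{\mathbf{e}}^3 \cdot \bar{\mathbf{e}}_j$$
Hence, using Eqs. (5) and (6), we find
$$\bar{g}_{ij} = \mu_{ij}$$
and, therefore,
$$\begin{aligned}
\bar{\mathbf{e}}_1 &= \bar{g}_{11}\bar{\mathbf{e}}^1 + \bar{g}_{12}\bar{\mathbf{e}}^2 + \bar{g}_{13}\bar{\mathbf{e}}^3\\
\bar{\mathbf{e}}_2 &= \bar{g}_{21}\bar{\mathbf{e}}^1 + \bar{g}_{22}\bar{\mathbf{e}}^2 + \bar{g}_{23}\bar{\mathbf{e}}^3\\
\bar{\mathbf{e}}_3 &= \bar{g}_{31}\bar{\mathbf{e}}^1 + \bar{g}_{32}\bar{\mathbf{e}}^2 + \bar{g}_{33}\bar{\mathbf{e}}^3
\end{aligned} \tag{7}$$
If we define the matrices
$$\bar{V}_e = \begin{Vmatrix} \bar{\mathbf{e}}_1\\ \bar{\mathbf{e}}_2\\ \bar{\mathbf{e}}_3 \end{Vmatrix} \qquad\text{and}\qquad \bar{V}^e = \begin{Vmatrix} \bar{\mathbf{e}}^1\\ \bar{\mathbf{e}}^2\\ \bar{\mathbf{e}}^3 \end{Vmatrix}$$
Eq. (7) can be written more compactly in the form
$$\bar{V}_e = \bar{G}\bar{V}^e \tag{8}$$
from which it follows that
$$\bar{V}^e = \bar{G}^{-1}\bar{V}_e \tag{9}$$
or
$$\begin{aligned}
\bar{\mathbf{e}}^1 &= \bar{g}^{11}\bar{\mathbf{e}}_1 + \bar{g}^{12}\bar{\mathbf{e}}_2 + \bar{g}^{13}\bar{\mathbf{e}}_3\\
\bar{\mathbf{e}}^2 &= \bar{g}^{21}\bar{\mathbf{e}}_1 + \bar{g}^{22}\bar{\mathbf{e}}_2 + \bar{g}^{23}\bar{\mathbf{e}}_3\\
\bar{\mathbf{e}}^3 &= \bar{g}^{31}\bar{\mathbf{e}}_1 + \bar{g}^{32}\bar{\mathbf{e}}_2 + \bar{g}^{33}\bar{\mathbf{e}}_3
\end{aligned} \tag{10}$$
where $\bar{g}^{ij}$ is the element in the ith row and jth column of $\bar{G}^{-1}$;
that is,
$$\bar{g}^{ij} = \frac{\bar{G}_{ji}}{|\bar{G}|}$$
where $\bar{G}_{ji}$ is the cofactor of $\bar{g}_{ji}$ in $\bar{G}$.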


Of course, since $\bar{G} = \|\bar{g}_{ij}\| = (A^{-1})^T A^{-1}$ is symmetric, so is its
inverse $\bar{G}^{-1} = \|\bar{g}^{ij}\| = AA^T$. From (10) and (6) it follows
immediately that
$$\bar{\mathbf{e}}^i \cdot \bar{\mathbf{e}}^j = \bar{g}^{ij} \tag{11}$$
Thus, in oblique coordinates the metrical properties of space, which
are determined by the matrix $\bar{G} = \|\bar{g}_{ij}\| = (A^{-1})^T A^{-1}$ if vectors are
represented in terms of the base vectors $\bar{\mathbf{e}}_1$, $\bar{\mathbf{e}}_2$, $\bar{\mathbf{e}}_3$, are determined
equally well by the inverse matrix $\bar{G}^{-1} = \|\bar{g}^{ij}\| = AA^T$ if vectors are
represented in terms of the reciprocal base vectors $\bar{\mathbf{e}}^1$, $\bar{\mathbf{e}}^2$, $\bar{\mathbf{e}}^3$.
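
As a concrete illustration of Eqs. (5), (6), (10), and (11), the reciprocal set can be built from the base vectors and the inverse metric matrix and then tested against its defining conditions. This is a minimal sketch under stated assumptions: the matrix `A` is a hypothetical example (its inverse is written out by hand and checked in the code), the base vectors are taken as the columns of the inverse matrix in rectangular components, and the names `e_lower` and `e_upper` are introduced only for this example.

```python
A = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]        # hypothetical transformation matrix
A_inv = [[1, -1, 0], [0, 1, 0], [0, 0, 1]]   # its inverse, verified just below

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[dot(row, col) for col in zip(*b)] for row in a]

identity = [[int(i == j) for j in range(3)] for i in range(3)]
assert matmul(A, A_inv) == identity

# Rectangular components of the oblique base vectors: the columns of A^{-1}.
e_lower = transpose(A_inv)

# Metric matrix from Eq. (5) and its inverse from the relation G^{-1} = A A^T.
G_bar = [[dot(e_lower[i], e_lower[j]) for j in range(3)] for i in range(3)]
G_bar_inv = matmul(A, transpose(A))

# Reciprocal base vectors from Eq. (10): e^i = sum over j of g^{ij} e_j.
e_upper = [[sum(G_bar_inv[i][j] * e_lower[j][k] for j in range(3)) for k in range(3)]
           for i in range(3)]

delta = [[dot(e_upper[i], e_lower[j]) for j in range(3)] for i in range(3)]       # Eq. (6)
gram_upper = [[dot(e_upper[i], e_upper[j]) for j in range(3)] for i in range(3)]  # Eq. (11)
print(e_upper, delta, gram_upper)
```

In this example the reciprocal vectors turn out to be the rows of `A`, and their scalar products reproduce the elements of `G_bar_inv`.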

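It is also instructive to consider the representation of the
vectors $\mathbf{i} = \mathbf{e}_1 = \mathbf{e}^1$, $\mathbf{j} = \mathbf{e}_2 = \mathbf{e}^2$, $\mathbf{k} = \mathbf{e}_3 = \mathbf{e}^3$ in terms of the vectors
$\bar{\mathbf{e}}_1$, $\bar{\mathbf{e}}_2$, $\bar{\mathbf{e}}_3$ and $\bar{\mathbf{e}}^1$, $\bar{\mathbf{e}}^2$, $\bar{\mathbf{e}}^3$, and vice versa. Specifically, since $\bar{\mathbf{e}}_1$, $\bar{\mathbf{e}}_2$, $\bar{\mathbf{e}}_3$
are, respectively, the vectors from the origin $O$ ($= \bar{O}$) to the points
whose oblique coordinates are (1,0,0), (0,1,0), and (0,0,1) and
since, from the transformation equation $X = A^{-1}\bar{X}$, these points
have rectangular coordinates $(a^{11}, a^{21}, a^{31})$, $(a^{12}, a^{22}, a^{32})$, and
$(a^{13}, a^{23}, a^{33})$, where $a^{ij} = A_{ji}/|A|$ is the element in the ith row and
jth column of the matrix $A^{-1}$, it follows that
$$\begin{aligned}
\bar{\mathbf{e}}_1 &= a^{11}\mathbf{e}_1 + a^{21}\mathbf{e}_2 + a^{31}\mathbf{e}_3\\
\bar{\mathbf{e}}_2 &= a^{12}\mathbf{e}_1 + a^{22}\mathbf{e}_2 + a^{32}\mathbf{e}_3\\
\bar{\mathbf{e}}_3 &= a^{13}\mathbf{e}_1 + a^{23}\mathbf{e}_2 + a^{33}\mathbf{e}_3
\end{aligned} \tag{12}$$
or, introducing the matrices
$$\bar{V}_e = \begin{Vmatrix} \bar{\mathbf{e}}_1\\ \bar{\mathbf{e}}_2\\ \bar{\mathbf{e}}_3 \end{Vmatrix} \qquad\text{and}\qquad V_e = \begin{Vmatrix} \mathbf{i}\\ \mathbf{j}\\ \mathbf{k} \end{Vmatrix} = \begin{Vmatrix} \mathbf{e}_1\\ \mathbf{e}_2\\ \mathbf{e}_3 \end{Vmatrix} = \begin{Vmatrix} \mathbf{e}^1\\ \mathbf{e}^2\\ \mathbf{e}^3 \end{Vmatrix} = V^e$$
$$\bar{V}_e = (A^{-1})^T V_e = (A^T)^{-1} V_e \tag{13}$$
Either in the same fashion or directly from (13), we obtain
$$V_e = A^T \bar{V}_e \tag{14}$$
that is,
$$\begin{aligned}
\mathbf{e}_1 &= a_{11}\bar{\mathbf{e}}_1 + a_{21}\bar{\mathbf{e}}_2 + a_{31}\bar{\mathbf{e}}_3\\
\mathbf{e}_2 &= a_{12}\bar{\mathbf{e}}_1 + a_{22}\bar{\mathbf{e}}_2 + a_{32}\bar{\mathbf{e}}_3\\
\mathbf{e}_3 &= a_{13}\bar{\mathbf{e}}_1 + a_{23}\bar{\mathbf{e}}_2 + a_{33}\bar{\mathbf{e}}_3
\end{aligned} \tag{15}$$
as the equations expressing $\mathbf{i} = \mathbf{e}_1$, $\mathbf{j} = \mathbf{e}_2$, $\mathbf{k} = \mathbf{e}_3$ in terms of the
base vectors $\bar{\mathbf{e}}_1$, $\bar{\mathbf{e}}_2$, $\bar{\mathbf{e}}_3$ of the oblique system.

To obtain the equations relating $\bar{\mathbf{e}}^1$, $\bar{\mathbf{e}}^2$, $\bar{\mathbf{e}}^3$ and $\mathbf{i} = \mathbf{e}^1$, $\mathbf{j} = \mathbf{e}^2$,
$\mathbf{k} = \mathbf{e}^3$, we begin with the relation (9), i.e., $\bar{V}^e = \bar{G}^{-1}\bar{V}_e$. From
this, using (13) and the fact that $\bar{G}^{-1} = AA^T$ and $V_e = V^e$, we
have
$$\bar{V}^e = (AA^T)(A^T)^{-1}V^e = AV^e \tag{16}$$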


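that is,
$$\begin{aligned}
\bar{\mathbf{e}}^1 &= a_{11}\mathbf{e}^1 + a_{12}\mathbf{e}^2 + a_{13}\mathbf{e}^3\\
\bar{\mathbf{e}}^2 &= a_{21}\mathbf{e}^1 + a_{22}\mathbf{e}^2 + a_{23}\mathbf{e}^3\\
\bar{\mathbf{e}}^3 &= a_{31}\mathbf{e}^1 + a_{32}\mathbf{e}^2 + a_{33}\mathbf{e}^3
\end{aligned} \tag{17}$$
Solving (16) for $V^e$, we have, of course,
$$V^e = A^{-1}\bar{V}^e \tag{18}$$
or
$$\begin{aligned}
\mathbf{e}^1 &= a^{11}\bar{\mathbf{e}}^1 + a^{12}\bar{\mathbf{e}}^2 + a^{13}\bar{\mathbf{e}}^3\\
\mathbf{e}^2 &= a^{21}\bar{\mathbf{e}}^1 + a^{22}\bar{\mathbf{e}}^2 + a^{23}\bar{\mathbf{e}}^3\\
\mathbf{e}^3 &= a^{31}\bar{\mathbf{e}}^1 + a^{32}\bar{\mathbf{e}}^2 + a^{33}\bar{\mathbf{e}}^3
\end{aligned} \tag{19}$$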

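Suppose now that we have a vector
$$\mathbf{V} = \mathbf{V}_r = v^1\mathbf{i} + v^2\mathbf{j} + v^3\mathbf{k} = v^1\mathbf{e}_1 + v^2\mathbf{e}_2 + v^3\mathbf{e}_3 \equiv v_1\mathbf{e}^1 + v_2\mathbf{e}^2 + v_3\mathbf{e}^3 = \mathbf{V}^r$$
where, since $\mathbf{V}$ is given in a rectangular coordinate system, $\mathbf{e}_i = \mathbf{e}^i$
and $\mathbf{V}_r = \mathbf{V}^r$. If we express $\mathbf{V} \equiv \mathbf{V}_r$ in terms of the base vectors
$\bar{\mathbf{e}}_1$, $\bar{\mathbf{e}}_2$, $\bar{\mathbf{e}}_3$ of the oblique system by means of (15), we obtain, after
collecting terms,
$$\begin{aligned}
\mathbf{V}_r &= (v^1a_{11} + v^2a_{12} + v^3a_{13})\bar{\mathbf{e}}_1 + (v^1a_{21} + v^2a_{22} + v^3a_{23})\bar{\mathbf{e}}_2\\
&\qquad + (v^1a_{31} + v^2a_{32} + v^3a_{33})\bar{\mathbf{e}}_3\\
&= \bar{v}^1\bar{\mathbf{e}}_1 + \bar{v}^2\bar{\mathbf{e}}_2 + \bar{v}^3\bar{\mathbf{e}}_3
\end{aligned} \tag{20}$$
Similarly, if we express $\mathbf{V} \equiv \mathbf{V}^r$ in terms of the reciprocal base
vectors $\bar{\mathbf{e}}^1$, $\bar{\mathbf{e}}^2$, $\bar{\mathbf{e}}^3$ by means of (19), we obtain the representation
$$\begin{aligned}
\mathbf{V}^r &= (v_1a^{11} + v_2a^{21} + v_3a^{31})\bar{\mathbf{e}}^1 + (v_1a^{12} + v_2a^{22} + v_3a^{32})\bar{\mathbf{e}}^2\\
&\qquad + (v_1a^{13} + v_2a^{23} + v_3a^{33})\bar{\mathbf{e}}^3\\
&= \bar{v}_1\bar{\mathbf{e}}^1 + \bar{v}_2\bar{\mathbf{e}}^2 + \bar{v}_3\bar{\mathbf{e}}^3
\end{aligned} \tag{21}$$
Thus, when $\mathbf{V}$ is transformed from its representation in terms
of the base vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ to the corresponding representation
in terms of the base vectors $\bar{\mathbf{e}}_1$, $\bar{\mathbf{e}}_2$, $\bar{\mathbf{e}}_3$, the components of $\mathbf{V} = \mathbf{V}_r$
transform according to the law
$$\bar{v}^i = a_{i1}v^1 + a_{i2}v^2 + a_{i3}v^3 \tag{22}$$
or
$$\bar{V}_r = AV_r \tag{23}$$
Likewise, when $\mathbf{V} = \mathbf{V}^r$ is transformed from its representation in
terms of the base vectors $\mathbf{e}^1$, $\mathbf{e}^2$, $\mathbf{e}^3$ to its corresponding
representation in terms of the reciprocal base vectors $\bar{\mathbf{e}}^1$, $\bar{\mathbf{e}}^2$, $\bar{\mathbf{e}}^3$, its components
transform according to the law
$$\bar{v}_i = a^{1i}v_1 + a^{2i}v_2 + a^{3i}v_3 \tag{24}$$
or
$$\bar{V}^r = (A^{-1})^T V^r \tag{25}$$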


Equations (24) and (25) have exactly the same form as Eqs. (12)
and (13) for the transformation of the base vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$; for
this reason, the representation of $\mathbf{V}$ in terms of the reciprocal base
vectors is called the covariant representation* of $\mathbf{V}$. On the other
hand, Eqs. (22) and (23) have the form of Eqs. (16) and (17) for
the transformation of the reciprocal base vectors $\mathbf{e}^1$, $\mathbf{e}^2$, $\mathbf{e}^3$; for
this reason, the representation of $\mathbf{V}$ in terms of the base vectors
themselves is called the contravariant representation† of $\mathbf{V}$.
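
The two transformation laws (23) and (25) can be compared on a small numerical example. The sketch below rests on assumptions made only for illustration: the matrix `A` and the components `v` are hypothetical, the base vectors are taken as the columns of the inverse matrix as in Eq. (12), and the reciprocal vectors as the rows of `A` as in Eq. (17). It rebuilds the same geometric vector from either set of components.

```python
A = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]        # hypothetical transformation matrix
A_inv = [[1, -1, 0], [0, 1, 0], [0, 0, 1]]   # its inverse (A times A_inv is the identity)

def transpose(m):
    return [list(col) for col in zip(*m)]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

v = [3, 4, 0]                         # rectangular components, where v^i = v_i

v_contra = matvec(A, v)               # contravariant components, Eq. (23)
v_co = matvec(transpose(A_inv), v)    # covariant components, Eq. (25)

e_lower = transpose(A_inv)            # base vectors: columns of A^{-1}, Eq. (12)
e_upper = [row[:] for row in A]       # reciprocal vectors: rows of A, Eq. (17)

# Rebuild the rectangular components from each representation.
from_contra = [sum(v_contra[i] * e_lower[i][k] for i in range(3)) for k in range(3)]
from_co = [sum(v_co[i] * e_upper[i][k] for i in range(3)) for k in range(3)]

# The sum of products of covariant and contravariant components is the squared length.
invariant = sum(a * b for a, b in zip(v_co, v_contra))
print(v_contra, v_co, from_contra, from_co, invariant)
```

Although the two sets of components differ, both reconstructions return the original vector, which is the sense in which the two representations describe one and the same quantity.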

From Eq. (1) it is clear that
$$a_{ij} = \frac{\partial \bar{x}^i}{\partial x^j}$$
and from Eq. (2) it is clear that
$$a^{ij} = \frac{\partial x^i}{\partial \bar{x}^j}$$
There is no particular reason for introducing this notation in the
study of oblique coordinates, but it may be helpful as a preparation
for the work of the next section on generalized coordinates to
rewrite some of the important formulas of this section in terms of
the partial derivatives of the transformation equations.

For the matrix of the transformation and its inverse we have,
respectively,
$$A = \|a_{ij}\| = \left\| \frac{\partial \bar{x}^i}{\partial x^j} \right\| \qquad\text{and}\qquad A^{-1} = \|a^{ij}\| = \left\| \frac{\partial x^i}{\partial \bar{x}^j} \right\|$$


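For the general element of the matrix $\bar{G} = (A^{-1})^T A^{-1}$, which
in oblique coordinates defines the metrical properties of space, we
have
$$\bar{g}_{ij} = \sum_k a^{ki}a^{kj} = \sum_k \frac{\partial x^k}{\partial \bar{x}^i}\frac{\partial x^k}{\partial \bar{x}^j}$$
or, inserting the factor $g_{kl}$, which of course is 1 if $k = l$ and 0 if
$k \ne l$,
$$\bar{g}_{ij} = \sum_{k,l} g_{kl}\,\frac{\partial x^k}{\partial \bar{x}^i}\frac{\partial x^l}{\partial \bar{x}^j} \tag{26}$$
Likewise, for the matrix $\bar{G}^{-1} = \|\bar{g}^{ij}\| = AA^T$, we have
$$\bar{g}^{ij} = \sum_k a_{ik}a_{jk} = \sum_k \frac{\partial \bar{x}^i}{\partial x^k}\frac{\partial \bar{x}^j}{\partial x^k}$$
or, inserting $g^{kl} \equiv g_{kl}$,
$$\bar{g}^{ij} = \sum_{k,l} g^{kl}\,\frac{\partial \bar{x}^i}{\partial x^k}\frac{\partial \bar{x}^j}{\partial x^l} \tag{27}$$

* Co- = with or alike.
† Contra- = against or opposite to.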



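For the relations connecting the base vectors $\bar{\mathbf{e}}_1$, $\bar{\mathbf{e}}_2$, $\bar{\mathbf{e}}_3$ and
the vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$, we have from (12) and (15),
$$\bar{\mathbf{e}}_i = \sum_k a^{ki}\mathbf{e}_k = \sum_k \mathbf{e}_k \frac{\partial x^k}{\partial \bar{x}^i} \tag{28}$$
and
$$\mathbf{e}_i = \sum_k a_{ki}\bar{\mathbf{e}}_k = \sum_k \bar{\mathbf{e}}_k \frac{\partial \bar{x}^k}{\partial x^i} \tag{29}$$
For the relations connecting the reciprocal base vectors
$\bar{\mathbf{e}}^1$, $\bar{\mathbf{e}}^2$, $\bar{\mathbf{e}}^3$ and the vectors $\mathbf{e}^1$, $\mathbf{e}^2$, $\mathbf{e}^3$, we have from (17) and (19),
$$\bar{\mathbf{e}}^i = \sum_k a_{ik}\mathbf{e}^k = \sum_k \mathbf{e}^k \frac{\partial \bar{x}^i}{\partial x^k} \tag{30}$$
and
$$\mathbf{e}^i = \sum_k a^{ik}\bar{\mathbf{e}}^k = \sum_k \bar{\mathbf{e}}^k \frac{\partial x^i}{\partial \bar{x}^k} \tag{31}$$
For the components of a vector represented covariantly,
we have from the law of transformation (24),
$$\bar{v}_i = \sum_k a^{ki}v_k = \sum_k v_k \frac{\partial x^k}{\partial \bar{x}^i} \tag{32}$$
For the components of a vector represented contravariantly,
we have from the law of transformation (22),
$$\bar{v}^i = \sum_k a_{ik}v^k = \sum_k v^k \frac{\partial \bar{x}^i}{\partial x^k} \tag{33}$$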

If we have a general transformation of coordinates, say
$$\bar{x}^i = \bar{x}^i(x^1,x^2,x^3) \qquad i = 1, 2, 3$$
then any vector whose components transform according to the law
(32) is called a covariant vector, and any vector whose components
transform according to the law (33) is called a contravariant
vector. In rectangular coordinates, as we pointed out
earlier, the base set $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ and the reciprocal set $\mathbf{e}^1$, $\mathbf{e}^2$, $\mathbf{e}^3$ are
identical. Hence, there is no distinction between covariant and
contravariant vectors, and no need to introduce the two concepts,
in elementary vector analysis.

EXERCISES 

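1 Prove that, for any nonsingular matrix $A$, the product $\bar{G} = (A^{-1})^T A^{-1}$ is symmetric.

2 What is the condition that the set of base vectors $\bar{\mathbf{e}}_1$, $\bar{\mathbf{e}}_2$, $\bar{\mathbf{e}}_3$ and the set of reciprocal vectors
$\bar{\mathbf{e}}^1$, $\bar{\mathbf{e}}^2$, $\bar{\mathbf{e}}^3$ be the same?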

3 a Let $x^1$, $x^2$, $x^3$ and $\bar{x}^1$, $\bar{x}^2$, $\bar{x}^3$ be, respectively, rectangular and oblique coordinates connected
by the transformation equations
$$\begin{aligned}
\bar{x}^1 &= 2x^1 + x^3\\
\bar{x}^2 &= x^1 + 2x^2 + 3x^3\\
\bar{x}^3 &= x^1 + x^2 + x^3
\end{aligned}$$
Working directly from their definitions, determine the rectangular representation of the
base vectors $\bar{\mathbf{e}}_1$, $\bar{\mathbf{e}}_2$, $\bar{\mathbf{e}}_3$ and the reciprocal vectors $\bar{\mathbf{e}}^1$, $\bar{\mathbf{e}}^2$, $\bar{\mathbf{e}}^3$. Thence verify that Eqs. (12) and
(17) are satisfied.

b Work part a if the matrix of the transformation to oblique coordinates is
$$A = \begin{Vmatrix} 1 & 2 & -1\\ 2 & 1 & 3\\ 1 & 1 & 1 \end{Vmatrix}$$

4 a In Exercise 3a, what is the distance from the origin to the point whose oblique coordinates
are (1,1,1)? What is the distance between the points whose oblique coordinates are (1,1,1)
and (1,2,3)?

b In Exercise 3b, what is the distance from the origin to the points whose oblique
coordinates are (1,0,-1) and (2,1,1)?

5 a If $\bar{\mathbf{U}}$ and $\bar{\mathbf{V}}$ are two vectors represented contravariantly in an oblique coordinate system
connected with a rectangular coordinate system by the transformation $\bar{X} = AX$, show that
the angle between $\bar{\mathbf{U}}$ and $\bar{\mathbf{V}}$ is given by the formula
$$\cos\theta = \frac{\bar{U}^T \bar{G} \bar{V}}{\sqrt{\bar{U}^T \bar{G} \bar{U}}\,\sqrt{\bar{V}^T \bar{G} \bar{V}}}$$

b What is the angle between two vectors $\bar{\mathbf{U}}$ and $\bar{\mathbf{V}}$ represented covariantly?


13.3 

Generalized coordinates 

Let $x^1$, $x^2$, $x^3$ be three independent, single-valued, differentiable
scalar point functions such that to every point of some region
R of three-dimensional euclidean space there corresponds a
unique triple of values $(x^1,x^2,x^3)$, and such that to every triple
of values $(x^1,x^2,x^3)$ within ranges determined by the nature of R
there corresponds a unique point of R. Then $x^1$, $x^2$, $x^3$ are called
generalized coordinates in R, and the correspondence between
the points of R and the number triples $(x^1,x^2,x^3)$ is called a
generalized coordinate system for R. Rectangular, cylindrical,
spherical, and now oblique coordinates are familiar examples of
generalized coordinates.

Through each point P of R there passes a unique surface $S_1$
on which $x^1$ is constant, a unique surface $S_2$ on which $x^2$ is
constant, and a unique surface $S_3$ on which $x^3$ is constant. These
surfaces intersect by pairs in curves, called parametric curves, which
pass through P and on which one and only one of the generalized
coordinates varies. Under the assumptions we have made about
$x^1$, $x^2$, $x^3$, it can be shown that at each point of R the tangents to
the parametric curves which pass through that point are
noncoplanar. In general, the tangents to the parametric curves will
vary in direction from point to point, and no one set of directions
is singled out as any more natural than any other for the directions
of a set of base vectors for R. However, at each point, vectors along
the tangents to the parametric curves through that point provide
a natural basis for the representation of vectors extending from
that point as origin (Fig. 13.4), and our development will be based
on this concept of local base vectors and, of course, the related
concept of local reciprocal base vectors.

FIGURE 13.4
The parametric curves and the local base vectors at a point P in a generalized coordinate system.

The local base vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ at any point P we define to
have, respectively, the directions of the tangents to the $x^1$-, $x^2$-,
$x^3$-parametric curves at P, and to have lengths $|\mathbf{e}_i| = \sqrt{\mathbf{e}_i \cdot \mathbf{e}_i}$,
such that, if $ds$ is the infinitesimal distance along the $x^i$-parametric
curve corresponding to the infinitesimal change $dx^i$ in $x^i$, then
$$ds = |\mathbf{e}_i\,dx^i| = \sqrt{\mathbf{e}_i \cdot \mathbf{e}_i}\;|dx^i|$$
At P we define the local reciprocal base vectors $\mathbf{e}^1$, $\mathbf{e}^2$, $\mathbf{e}^3$ precisely
as we did in oblique coordinates, namely, by the conditions
$$\mathbf{e}^i \cdot \mathbf{e}_j = \begin{cases} 0 & i \ne j\\ 1 & i = j \end{cases}$$
where, as usual, $\mathbf{e}^i \cdot \mathbf{e}_j = |\mathbf{e}^i|\,|\mathbf{e}_j| \cos(\mathbf{e}^i,\mathbf{e}_j)$.

Since our definitions for the local base vectors and the
corresponding reciprocal vectors involve the notion of length, we
must, of course, have some method of measuring distances. To do
this, we assume the existence throughout R of a positive-definite
matrix
$$G = \|g_{ij}\| = \begin{Vmatrix} g_{11} & g_{12} & g_{13}\\ g_{21} & g_{22} & g_{23}\\ g_{31} & g_{32} & g_{33} \end{Vmatrix}$$
whose elements are functions of the generalized coordinates and
which has the property that, if
$$dX = \begin{Vmatrix} dx^1\\ dx^2\\ dx^3 \end{Vmatrix}$$
then the distance $ds$ from $P$: $(x^1, x^2, x^3)$ to $Q$: $(x^1 + dx^1, x^2 + dx^2,
x^3 + dx^3)$ is given by the formula
$$(ds)^2 = (dX)^T G\,(dX) = \sum_{i,j} g_{ij}\,dx^i\,dx^j \tag{2}$$

Thus, if $x^1 = x^1(t)$, $x^2 = x^2(t)$, $x^3 = x^3(t)$ are the parametric
equations of a curve, then the length of the curve between the
points $P_1$ and $P_2$ at which $t$ has the values $t_1$ and $t_2$, respectively,
is
$$\int_{P_1}^{P_2} ds = \int_{P_1}^{P_2} \sqrt{\sum_{i,j} g_{ij}\,dx^i\,dx^j} = \int_{t_1}^{t_2} \sqrt{\sum_{i,j} g_{ij}\,\frac{dx^i}{dt}\frac{dx^j}{dt}}\;dt$$
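
The arc-length integral can be approximated numerically once the matrix G is known as a function of the coordinates. The following sketch is offered only as an illustration under stated assumptions: it uses cylindrical coordinates $(r, \theta, z)$, for which $(ds)^2 = (dr)^2 + r^2(d\theta)^2 + (dz)^2$, a hypothetical helix as the curve, and a simple midpoint rule; the function names are invented for this example.

```python
import math

def metric_cylindrical(x):
    """G for cylindrical coordinates x = (r, theta, z): diag(1, r^2, 1)."""
    r = x[0]
    return [[1.0, 0.0, 0.0], [0.0, r * r, 0.0], [0.0, 0.0, 1.0]]

def curve_length(x_of_t, dx_dt, metric, t1, t2, steps=1000):
    """Midpoint-rule estimate of the integral of sqrt(g_ij dx^i/dt dx^j/dt) dt."""
    h = (t2 - t1) / steps
    total = 0.0
    for n in range(steps):
        t = t1 + (n + 0.5) * h
        g, v = metric(x_of_t(t)), dx_dt(t)
        speed_sq = sum(g[i][j] * v[i] * v[j] for i in range(3) for j in range(3))
        total += math.sqrt(speed_sq) * h
    return total

def helix(t):
    """Hypothetical curve r = 2, theta = t, z = t."""
    return (2.0, t, t)

def helix_rate(t):
    """Derivatives dx^i/dt of the helix coordinates."""
    return (0.0, 1.0, 1.0)

length = curve_length(helix, helix_rate, metric_cylindrical, 0.0, 2.0 * math.pi)
print(length, 2.0 * math.pi * math.sqrt(5.0))
```

For this helix the integrand is the constant $\sqrt{r^2 + 1} = \sqrt{5}$, so the numerical value can be compared directly with $2\pi\sqrt{5}$.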

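In particular, for the length of an arbitrary infinitesimal vector
$$\mathbf{e}_1\,dx^1 + \mathbf{e}_2\,dx^2 + \mathbf{e}_3\,dx^3$$
we have
$$\begin{aligned}
(ds)^2 &= (\mathbf{e}_1\,dx^1 + \mathbf{e}_2\,dx^2 + \mathbf{e}_3\,dx^3) \cdot (\mathbf{e}_1\,dx^1 + \mathbf{e}_2\,dx^2 + \mathbf{e}_3\,dx^3)\\
&= \sum_{i,j} \mathbf{e}_i \cdot \mathbf{e}_j\,dx^i\,dx^j = \sum_{i,j} g_{ij}\,dx^i\,dx^j
\end{aligned}$$
Since the differentials of the coordinates are independent and
arbitrary, the coefficients of corresponding terms in the last two
sums must be identical; therefore,
$$\mathbf{e}_i \cdot \mathbf{e}_j = g_{ij} \tag{3}$$
In particular,
$$|\mathbf{e}_i| = \sqrt{\mathbf{e}_i \cdot \mathbf{e}_i} = \sqrt{g_{ii}} \tag{4}$$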

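Using (3) and (4), we can determine without integration the
length of a noninfinitesimal vector $\mathbf{V} = v^1\mathbf{e}_1 + v^2\mathbf{e}_2 + v^3\mathbf{e}_3$
expressed in terms of the base vectors at a point P. In fact,
$$\begin{aligned}
|\mathbf{V}|^2 = \mathbf{V} \cdot \mathbf{V} &= (v^1\mathbf{e}_1 + v^2\mathbf{e}_2 + v^3\mathbf{e}_3) \cdot (v^1\mathbf{e}_1 + v^2\mathbf{e}_2 + v^3\mathbf{e}_3)\\
&= \sum_{i,j} \mathbf{e}_i \cdot \mathbf{e}_j\,v^iv^j = \sum_{i,j} g_{ij}v^iv^j = V^TGV
\end{aligned}$$
Hence,
$$|\mathbf{V}| = \sqrt{V^TGV} \tag{5}$$
where, of course, the elements $g_{ij}$ of $G$ are to be evaluated at the
point P at which $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ are the base vectors.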
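
Formula (5) is easy to apply once G has been evaluated at the point in question. As a hedged illustration, the sketch below assumes spherical coordinates $(r, \theta, \phi)$ with $\theta$ measured from the polar axis, for which $G$ is the diagonal matrix with elements $1$, $r^2$, $r^2\sin^2\theta$; the point and the components of the vector are hypothetical values chosen so that the answer is simple.

```python
import math

def metric_spherical(r, theta):
    """G for spherical coordinates (r, theta, phi): diag(1, r^2, r^2 sin^2 theta)."""
    s = math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, r * r, 0.0], [0.0, 0.0, r * r * s * s]]

def vector_length(v, g):
    """Length from Eq. (5): the square root of V^T G V."""
    return math.sqrt(sum(g[i][j] * v[i] * v[j] for i in range(3) for j in range(3)))

G = metric_spherical(2.0, math.pi / 2.0)   # evaluated at r = 2 on the equatorial plane
V = [1.0, 1.0, 1.0]                        # hypothetical contravariant components at P

length = vector_length(V, G)
off_diagonal_zero = all(G[i][j] == 0.0 for i in range(3) for j in range(3) if i != j)
print(length, off_diagonal_zero)
```

Although all three components are equal to 1, the length is 3 rather than $\sqrt{3}$, because the local base vectors in the $\theta$ and $\phi$ directions have length 2 at this point.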

From (3) we also draw the important conclusion that a necessary
and sufficient condition that the parametric curves be orthogonal
at every point of R is that $g_{ij} = 0$ for $i \ne j$ at all points of R.

By exactly the same reasoning we used to derive Eqs. (7)
and (10) in the last section we can now prove that, for the local
base vectors and the local reciprocal base vectors, we have the
following relations:
$$\mathbf{e}_i = \sum_k g_{ik}\mathbf{e}^k \tag{6}$$
$$\mathbf{e}^i = \sum_k g^{ik}\mathbf{e}_k \tag{7}$$
where, as in the last section, $g^{ik}$ is the element in the ith row and
kth column of the matrix $G^{-1}$ which is the inverse of $G = \|g_{ij}\|$.
Furthermore, by forming the scalar product of Eq. (7) with $\mathbf{e}^j$
and using the definitive relation $\mathbf{e}^j \cdot \mathbf{e}_k = \delta^j_k$, where $\delta^j_k$ is the
Kronecker delta,* we obtain the following companion result to
Eq. (3):
$$\mathbf{e}^i \cdot \mathbf{e}^j = g^{ij} \tag{8}$$

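Equation (7) is not the only formula which can be used to
express the local reciprocal vectors in terms of the local base
vectors. Specifically, it is easy to verify that $\mathbf{e}^1$, $\mathbf{e}^2$, $\mathbf{e}^3$ are given in
terms of $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ by the formulas
$$\mathbf{e}^1 = \frac{\mathbf{e}_2 \times \mathbf{e}_3}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]} \qquad \mathbf{e}^2 = \frac{\mathbf{e}_3 \times \mathbf{e}_1}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]} \qquad \mathbf{e}^3 = \frac{\mathbf{e}_1 \times \mathbf{e}_2}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]} \tag{9}$$
Hence, using the result of Exercise 22, Sec. 12.1,
$$[\mathbf{e}^1\mathbf{e}^2\mathbf{e}^3] = \frac{\mathbf{e}_2 \times \mathbf{e}_3}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]} \cdot \frac{\mathbf{e}_3 \times \mathbf{e}_1}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]} \times \frac{\mathbf{e}_1 \times \mathbf{e}_2}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]} = \frac{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]^2}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]^3} = \frac{1}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]} \tag{10}$$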
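
Formulas (9) and (10) lend themselves to a direct check. In the sketch below, which is only an illustration, the three base vectors are hypothetical noncoplanar vectors given by their rectangular components, and exact fractions are used so that the reciprocal conditions can be tested for equality.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def triple(u, v, w):
    """Scalar triple product [u v w] = u . (v x w)."""
    return dot(u, cross(v, w))

# Hypothetical local base vectors (rectangular components) with [e1 e2 e3] = 2.
e1, e2, e3 = [1, 0, 0], [1, 1, 0], [1, 1, 2]
box = Fraction(triple(e1, e2, e3))

# Reciprocal vectors from the cross-product formulas (9).
r1 = [c / box for c in cross(e2, e3)]
r2 = [c / box for c in cross(e3, e1)]
r3 = [c / box for c in cross(e1, e2)]

delta = [[dot(r, e) for e in (e1, e2, e3)] for r in (r1, r2, r3)]
reciprocal_box = triple(r1, r2, r3)   # should equal 1 / [e1 e2 e3] by Eq. (10)
print(r1, r2, r3, reciprocal_box)
```

The matrix `delta` of scalar products comes out as the identity, which confirms that the vectors built from (9) satisfy the defining conditions of the reciprocal set.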

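Moreover, using Eq. (6) in conjunction with Eq. (10), the
numerical value of $[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]$ (and hence of $[\mathbf{e}^1\mathbf{e}^2\mathbf{e}^3]$) can easily be found.
For, by (6),
$$\begin{aligned}
[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3] &= \Bigl(\sum_i g_{1i}\mathbf{e}^i\Bigr) \cdot \Bigl(\sum_j g_{2j}\mathbf{e}^j\Bigr) \times \Bigl(\sum_k g_{3k}\mathbf{e}^k\Bigr)\\
&= \sum_{i,j,k} g_{1i}g_{2j}g_{3k}[\mathbf{e}^i\mathbf{e}^j\mathbf{e}^k]
\end{aligned} \tag{11}$$
Now, of the $3^3 = 27$ terms which arise as $i$, $j$, and $k$ range
independently over the numbers 1, 2, 3, twenty-one are zero, because
the scalar triple product $[\mathbf{e}^i\mathbf{e}^j\mathbf{e}^k]$ contains at least one repeated
factor. Of the remaining six terms in the last sum there are three,
corresponding to the sets of values $(i,j,k) = (1,2,3)$, $(2,3,1)$, $(3,1,2)$,
in which
$$[\mathbf{e}^i\mathbf{e}^j\mathbf{e}^k] = [\mathbf{e}^1\mathbf{e}^2\mathbf{e}^3] = \frac{1}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]}$$
In the remaining three terms, corresponding to the sets of values
$(i,j,k) = (1,3,2)$, $(3,2,1)$, $(2,1,3)$, the factor $[\mathbf{e}^i\mathbf{e}^j\mathbf{e}^k]$ is equal to
$$-[\mathbf{e}^1\mathbf{e}^2\mathbf{e}^3] = -\frac{1}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]}$$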

Hence, factoring $1/[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]$ from the sum in (11) and
cross-multiplying, we have
$$\begin{aligned}
[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]^2 &= g_{11}g_{22}g_{33} + g_{12}g_{23}g_{31} + g_{13}g_{21}g_{32}\\
&\qquad - g_{11}g_{23}g_{32} - g_{13}g_{22}g_{31} - g_{12}g_{21}g_{33}
\end{aligned}$$
Since the sum on the right in the last equation is precisely the
expansion of the determinant of the matrix $G = \|g_{ij}\|$, we have
thus established the useful result
$$[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]^2 = |G| \tag{12}$$
where $G$ is the matrix which defines the metrical properties of
space.

* Defined at the end of Sec. 10.1.
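
Result (12) can be illustrated with the local base vectors of a familiar system. The sketch below makes several assumptions that are not part of the text: it uses cylindrical coordinates $(r, \theta, z)$, it takes the rectangular components of the local base vectors to be the partial derivatives of the position vector with respect to $r$, $\theta$, and $z$ (which gives vectors with the tangent directions and lengths required by the definition), and it works in floating-point arithmetic at a hypothetical point.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def det3(m):
    return dot(m[0], cross(m[1], m[2]))

def local_base_cylindrical(r, theta):
    """Rectangular components of the local base vectors for x = (r, theta, z)."""
    return ([math.cos(theta), math.sin(theta), 0.0],
            [-r * math.sin(theta), r * math.cos(theta), 0.0],
            [0.0, 0.0, 1.0])

e = local_base_cylindrical(2.0, 0.7)   # hypothetical point r = 2, theta = 0.7
G = [[dot(e[i], e[j]) for j in range(3)] for i in range(3)]   # g_ij from Eq. (3)

box = dot(e[0], cross(e[1], e[2]))     # scalar triple product [e1 e2 e3]
print(box ** 2, det3(G))
```

Here the triple product equals $r$ and the determinant of G equals $r^2$, so both printed values are close to 4.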

We now turn our attention to transformations from one set of
generalized coordinates to another. In particular, we are interested
in the laws of transformation for the fundamental matrices $G$ and
$G^{-1}$, the local base vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$, the local reciprocal vectors
$\mathbf{e}^1$, $\mathbf{e}^2$, $\mathbf{e}^3$, and vectors expressed in terms of these reference vectors,
which are induced by a transformation of coordinates.

Let us suppose, then, that we have two systems of coordinates
$(x^1,x^2,x^3)$ and $(\bar{x}^1,\bar{x}^2,\bar{x}^3)$ connected by transformation equations of
the form
$$\begin{aligned}
\bar{x}^1 &= \bar{x}^1(x^1,x^2,x^3)\\
\bar{x}^2 &= \bar{x}^2(x^1,x^2,x^3)\\
\bar{x}^3 &= \bar{x}^3(x^1,x^2,x^3)
\end{aligned}$$
or, simply,
$$T:\quad \bar{x}^i = \bar{x}^i(x^1,x^2,x^3) \qquad i = 1, 2, 3 \tag{13}$$
In particular cases, the equations (13) might be the equations
connecting a rectangular and an oblique coordinate system, as in
the last section; a rectangular and a cylindrical coordinate system;
a rectangular and a spherical coordinate system; or a cylindrical
and a spherical coordinate system.

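Naturally, we wish a point with coordinates $(x^1,x^2,x^3)$ in the
x-system to have a unique set of coordinates $(\bar{x}^1,\bar{x}^2,\bar{x}^3)$ in the
$\bar{x}$-system. Hence we require that, throughout the region R with
which we are concerned, the $\bar{x}^i$'s be single-valued functions of the
$x^i$'s. Moreover, we wish the point with coordinates $(\bar{x}^1,\bar{x}^2,\bar{x}^3)$ to
have a unique set of x-coordinates. Hence, we also require that
Eqs. (13) be solvable for $x^1$, $x^2$, $x^3$ as single-valued functions of
$\bar{x}^1$, $\bar{x}^2$, $\bar{x}^3$:
$$T^{-1}:\quad x^i = x^i(\bar{x}^1,\bar{x}^2,\bar{x}^3) \qquad i = 1, 2, 3 \tag{14}$$
In advanced calculus it is shown* that, if the first partial
derivatives of the coordinate functions $\bar{x}^i$ in $T$ are continuous
and if, throughout R, the so-called Jacobian determinant†
$$|J| = \frac{\partial(\bar{x}^1,\bar{x}^2,\bar{x}^3)}{\partial(x^1,x^2,x^3)} = \begin{vmatrix}
\dfrac{\partial \bar{x}^1}{\partial x^1} & \dfrac{\partial \bar{x}^1}{\partial x^2} & \dfrac{\partial \bar{x}^1}{\partial x^3}\\[2mm]
\dfrac{\partial \bar{x}^2}{\partial x^1} & \dfrac{\partial \bar{x}^2}{\partial x^2} & \dfrac{\partial \bar{x}^2}{\partial x^3}\\[2mm]
\dfrac{\partial \bar{x}^3}{\partial x^1} & \dfrac{\partial \bar{x}^3}{\partial x^2} & \dfrac{\partial \bar{x}^3}{\partial x^3}
\end{vmatrix} \tag{15}$$
is different from zero, as we shall suppose, then around any
interior point of R there exists a neighborhood in which $T$ has
a single-valued inverse (14). Naturally, since the equations of the
inverse transformation (14) are to be uniquely solvable for $\bar{x}^1$, $\bar{x}^2$,
$\bar{x}^3$, the Jacobian determinant of the inverse transformation
$$|J^{-1}| = \frac{\partial(x^1,x^2,x^3)}{\partial(\bar{x}^1,\bar{x}^2,\bar{x}^3)} = \begin{vmatrix}
\dfrac{\partial x^1}{\partial \bar{x}^1} & \dfrac{\partial x^1}{\partial \bar{x}^2} & \dfrac{\partial x^1}{\partial \bar{x}^3}\\[2mm]
\dfrac{\partial x^2}{\partial \bar{x}^1} & \dfrac{\partial x^2}{\partial \bar{x}^2} & \dfrac{\partial x^2}{\partial \bar{x}^3}\\[2mm]
\dfrac{\partial x^3}{\partial \bar{x}^1} & \dfrac{\partial x^3}{\partial \bar{x}^2} & \dfrac{\partial x^3}{\partial \bar{x}^3}
\end{vmatrix} \tag{16}$$
must also be different from zero throughout R.

* See, for instance, R. C. Buck, "Advanced Calculus," p. 215, McGraw-Hill
Book Company, New York, 1956.

† Named for the German mathematician C. G. J. Jacobi (1804-1851).
We shall frequently refer to the Jacobian matrix $J$ simply as the Jacobian
of the transformation $T$.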
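
The two Jacobian determinants (15) and (16) can be estimated numerically for a specific transformation. The following sketch is an illustration only: it takes $T$ to be the hypothetical change from rectangular coordinates $(x, y, z)$ to cylindrical coordinates $(r, \theta, z)$, approximates the partial derivatives by central differences, and evaluates both determinants at one arbitrarily chosen point away from the z-axis.

```python
import math

def to_cylindrical(x):
    """T: rectangular (x, y, z) -> cylindrical (r, theta, z), away from the z-axis."""
    return [math.hypot(x[0], x[1]), math.atan2(x[1], x[0]), x[2]]

def to_rectangular(xb):
    """Inverse transformation: cylindrical (r, theta, z) -> rectangular (x, y, z)."""
    r, theta, z = xb
    return [r * math.cos(theta), r * math.sin(theta), z]

def jacobian(f, p, h=1e-6):
    """Central-difference estimates J[i][j] of the partial derivative of f_i by p_j."""
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        hi, lo = list(p), list(p)
        hi[j] += h
        lo[j] -= h
        f_hi, f_lo = f(hi), f(lo)
        for i in range(3):
            J[i][j] = (f_hi[i] - f_lo[i]) / (2 * h)
    return J

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

p = [1.0, 1.0, 0.5]                                   # hypothetical point, r = sqrt(2)
det_T = det3(jacobian(to_cylindrical, p))             # estimate of |J|
det_T_inv = det3(jacobian(to_rectangular, to_cylindrical(p)))   # estimate of |J^{-1}|
print(det_T, det_T_inv, det_T * det_T_inv)
```

For this transformation the first determinant is $1/r$ and the second is $r$, so neither vanishes away from the z-axis, and their product is 1 up to the error of the difference approximation.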

We have now reached the point where it is convenient, or
indeed necessary, to introduce the so-called Einstein summation
convention. Just as the summation symbol $\Sigma$ effects a great
notational economy when it is used instead of writing a sum of
terms at length, so this summation convention replaces the symbol
$\Sigma$ with a notation still shorter and much more suggestive. Briefly,
the convention is this: If any term contains the same letter twice as a
distinguishing index, it is understood that the term is to be summed for
all values of the repeated index. For example, using the summation
convention, with the understanding that the range of our indices
is 1 to 3, we can write the differential of $\bar{x}^i$ in the equivalent forms
$$d\bar{x}^i = \frac{\partial \bar{x}^i}{\partial x^1}\,dx^1 + \frac{\partial \bar{x}^i}{\partial x^2}\,dx^2 + \frac{\partial \bar{x}^i}{\partial x^3}\,dx^3 = \sum_j \frac{\partial \bar{x}^i}{\partial x^j}\,dx^j = \frac{\partial \bar{x}^i}{\partial x^j}\,dx^j$$
In the last expression, the index $i$ identifies the particular variable
$\bar{x}^i$ whose differential is being considered, and cannot be changed.
On the other hand, the index $j$ merely indicates that summation
over a certain range is to be carried out, and, like the variable
of integration in a definite integral, can be changed at pleasure to
any other letter except $i$, of course. Thus we can write equally well
$$\frac{\partial \bar{x}^i}{\partial x^j}\,dx^j = \frac{\partial \bar{x}^i}{\partial x^k}\,dx^k = \frac{\partial \bar{x}^i}{\partial x^\alpha}\,dx^\alpha$$
An index which can thus be arbitrarily replaced by another is
usually called a dummy index or an umbral index.

The summation convention also permits more than one pair
of repeated indices in a term to be summed. For instance, applying
the convention first to the repeated index $i$ and then to $j$, we have
$$\begin{aligned}
g_{ij}\,dx^i\,dx^j &= g_{1j}\,dx^1\,dx^j + g_{2j}\,dx^2\,dx^j + g_{3j}\,dx^3\,dx^j\\
&= (g_{11}\,dx^1\,dx^1 + g_{12}\,dx^1\,dx^2 + g_{13}\,dx^1\,dx^3)\\
&\qquad + (g_{21}\,dx^2\,dx^1 + g_{22}\,dx^2\,dx^2 + g_{23}\,dx^2\,dx^3)\\
&\qquad + (g_{31}\,dx^3\,dx^1 + g_{32}\,dx^3\,dx^2 + g_{33}\,dx^3\,dx^3)\\
&= \sum_{i,j} g_{ij}\,dx^i\,dx^j
\end{aligned}$$



SEC. 13.3 


GENERALIZED COORDINATES 


611 


It should be noted that 
ffa dx* dx* 7* ga dx i dx* 

since the latter is equal to the simpler sum 
0ii dx 1 dx l + 022 dx 2 dx 2 + g 3 z dx s dx 3 

Hence, unless the more restricted meaning is intended, the same 
index cannot be used a second time in the same term as a dummy 
index. 
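A small numerical sketch makes the distinction concrete. The matrix `g` and the differentials `dx` below are arbitrary illustrative values, not taken from the text.

```python
# Double sum g_ij dx^i dx^j versus the restricted sum g_ii dx^i dx^i.
# The entries of g and dx are made-up illustrative numbers.

N = 3
g = [[1.0, 0.2, 0.0],
     [0.2, 2.0, 0.3],
     [0.0, 0.3, 3.0]]
dx = [0.1, 0.2, 0.3]

# Two independent dummy indices: nine terms.
full = sum(g[i][j] * dx[i] * dx[j] for i in range(N) for j in range(N))

# One index used throughout: only the three diagonal terms survive.
diag = sum(g[i][i] * dx[i] * dx[i] for i in range(N))
```

The two results differ by the off-diagonal terms, which is why a second pair of summed indices needs a letter of its own.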

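Preparatory to resuming our discussion of coordinate
transformations, it will be helpful to introduce several simple lemmas
at this point:

LEMMA 1

If $(x^1,x^2,x^3)$ and $(\bar x^1,\bar x^2,\bar x^3)$ are coordinates connected by a transformation

$$x^i = x^i(\bar x^1,\bar x^2,\bar x^3)$$

then

$$\frac{\partial x^i}{\partial \bar x^\alpha}\frac{\partial \bar x^\alpha}{\partial x^j} = \delta^i_j$$

PROOF By hypothesis, $x^i$ is a differentiable function of $\bar x^1$, $\bar x^2$, $\bar x^3$, which in turn
are differentiable functions of $x^1$, $x^2$, $x^3$. Hence, by the chain rule of partial
differentiation,

$$\frac{\partial x^i}{\partial x^i} = 1 = \frac{\partial x^i}{\partial \bar x^1}\frac{\partial \bar x^1}{\partial x^i} + \frac{\partial x^i}{\partial \bar x^2}\frac{\partial \bar x^2}{\partial x^i} + \frac{\partial x^i}{\partial \bar x^3}\frac{\partial \bar x^3}{\partial x^i} = \frac{\partial x^i}{\partial \bar x^\alpha}\frac{\partial \bar x^\alpha}{\partial x^i} \qquad i \text{ not summed}$$

and, since the $x^i$'s are independent,

$$\frac{\partial x^i}{\partial x^j} = 0 = \frac{\partial x^i}{\partial \bar x^1}\frac{\partial \bar x^1}{\partial x^j} + \frac{\partial x^i}{\partial \bar x^2}\frac{\partial \bar x^2}{\partial x^j} + \frac{\partial x^i}{\partial \bar x^3}\frac{\partial \bar x^3}{\partial x^j} = \frac{\partial x^i}{\partial \bar x^\alpha}\frac{\partial \bar x^\alpha}{\partial x^j} \qquad i \neq j$$

These two relations together establish the assertion of the lemma. Of course, by
an identical proof it follows that

$$\frac{\partial \bar x^i}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial \bar x^j} = \delta^i_j$$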


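LEMMA 2

If $\phi^i$ and $\bar\phi^i$ ($i = 1, 2, 3$) are, respectively, functions of $x^1$, $x^2$, $x^3$ and $\bar x^1$, $\bar x^2$, $\bar x^3$, then

$$\bar\phi^i = \phi^\alpha\frac{\partial \bar x^i}{\partial x^\alpha} \qquad\text{implies}\qquad \phi^i = \bar\phi^\alpha\frac{\partial x^i}{\partial \bar x^\alpha}$$

and conversely.

PROOF To provide us with further insight into the efficiency of the summation
convention, let us first prove this lemma using the more familiar $\Sigma$ notation. We
are given the relation

$$\bar\phi^i = \phi^\alpha\frac{\partial \bar x^i}{\partial x^\alpha} \qquad\text{or}\qquad \bar\phi^i = \sum_{\beta=1}^{3}\phi^\beta\frac{\partial \bar x^i}{\partial x^\beta}$$

If we now multiply both sides of this equation in its second form by $\partial x^\alpha/\partial \bar x^i$ and then
sum over $i$, we have

$$\sum_{i=1}^{3}\bar\phi^i\frac{\partial x^\alpha}{\partial \bar x^i} = \sum_{i=1}^{3}\left(\sum_{\beta=1}^{3}\phi^\beta\frac{\partial \bar x^i}{\partial x^\beta}\right)\frac{\partial x^\alpha}{\partial \bar x^i}$$

or, interchanging the order of summation on the right,

$$\sum_{i=1}^{3}\bar\phi^i\frac{\partial x^\alpha}{\partial \bar x^i} = \sum_{\beta=1}^{3}\phi^\beta\left(\sum_{i=1}^{3}\frac{\partial \bar x^i}{\partial x^\beta}\frac{\partial x^\alpha}{\partial \bar x^i}\right)$$

Now, by Lemma 1, the inner sum on the right is equal to $\delta^\alpha_\beta$. Hence, the right-hand
side reduces to

$$\sum_{\beta=1}^{3}\phi^\beta\,\delta^\alpha_\beta$$

each term of which is equal to zero unless $\beta = \alpha$. Therefore, finally,

$$\sum_{i=1}^{3}\bar\phi^i\frac{\partial x^\alpha}{\partial \bar x^i} = \phi^\alpha$$

as asserted.

Using the summation convention, our proof would have proceeded as follows:
Introducing the dummy index $\beta$ in place of $\alpha$, we begin with

$$\bar\phi^i = \phi^\beta\frac{\partial \bar x^i}{\partial x^\beta}$$

Now, multiplying both sides by $\partial x^\alpha/\partial \bar x^i$ and using Lemma 1, we have

$$\bar\phi^i\frac{\partial x^\alpha}{\partial \bar x^i} = \phi^\beta\frac{\partial \bar x^i}{\partial x^\beta}\frac{\partial x^\alpha}{\partial \bar x^i} = \phi^\beta\,\delta^\alpha_\beta = \phi^\alpha$$
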

The converse assertion is, of course, established in exactly the same fashion. 
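LEMMA 3

If $\phi^{ij}$ and $\bar\phi^{ij}$ ($i,j = 1, 2, 3$) are, respectively, functions of $x^1$, $x^2$, $x^3$ and $\bar x^1$, $\bar x^2$, $\bar x^3$,
then any one of the relations

$$\bar\phi^{ij} = \phi^{\alpha\beta}\frac{\partial \bar x^i}{\partial x^\alpha}\frac{\partial \bar x^j}{\partial x^\beta}$$

$$\bar\phi^{ij}\frac{\partial x^\alpha}{\partial \bar x^i} = \phi^{\alpha\beta}\frac{\partial \bar x^j}{\partial x^\beta}$$

$$\bar\phi^{ij}\frac{\partial x^\beta}{\partial \bar x^j} = \phi^{\alpha\beta}\frac{\partial \bar x^i}{\partial x^\alpha}$$

$$\bar\phi^{ij}\frac{\partial x^\alpha}{\partial \bar x^i}\frac{\partial x^\beta}{\partial \bar x^j} = \phi^{\alpha\beta}$$

implies each of the others.

PROOF Because of the near-identity of the arguments, it will be sufficient to
establish just one of the assertions of the lemma, say the assertion that

$$\bar\phi^{ij} = \phi^{\alpha\beta}\frac{\partial \bar x^i}{\partial x^\alpha}\frac{\partial \bar x^j}{\partial x^\beta} \qquad\text{implies}\qquad \phi^{\alpha\beta} = \bar\phi^{ij}\frac{\partial x^\alpha}{\partial \bar x^i}\frac{\partial x^\beta}{\partial \bar x^j}$$

To do this, let us write the first relation using $a$ and $b$ in place of $\alpha$ and $\beta$, and then
let us multiply both sides by $\dfrac{\partial x^\alpha}{\partial \bar x^i}\dfrac{\partial x^\beta}{\partial \bar x^j}$ and sum over $i$ and $j$. This gives us, by
Lemma 1,

$$\bar\phi^{ij}\frac{\partial x^\alpha}{\partial \bar x^i}\frac{\partial x^\beta}{\partial \bar x^j} = \phi^{ab}\left(\frac{\partial \bar x^i}{\partial x^a}\frac{\partial x^\alpha}{\partial \bar x^i}\right)\left(\frac{\partial \bar x^j}{\partial x^b}\frac{\partial x^\beta}{\partial \bar x^j}\right) = \phi^{ab}\,\delta^\alpha_a\,\delta^\beta_b = \phi^{\alpha\beta}$$
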


In Secs. 10.2 and 10.3, when we considered linear
transformations such as

$$T_1\colon\ Y = AX \qquad\text{and}\qquad T_2\colon\ Z = BY \qquad A,\ B \text{ nonsingular}$$

we observed that the matrices of the inverse transformations $T_1^{-1}$
and $T_2^{-1}$ are $A^{-1}$ and $B^{-1}$, respectively, and that the matrix of
the transformation resulting when $T_1$ is followed by $T_2$ is $BA$.
Since linear transformations are obviously special cases of the
transformation (13), it is natural to ask whether general coordinate
transformations have comparable properties. The answer is yes,
and in fact we have the following theorems, the proof of the
second of which we shall leave as an exercise.


THEOREM 1

If $T\colon \bar x^\alpha = \bar x^\alpha(x^1,x^2,x^3)$ is a transformation with Jacobian $J$, then the Jacobian $\bar J$
of the inverse transformation $T^{-1}\colon x^\alpha = x^\alpha(\bar x^1,\bar x^2,\bar x^3)$ is $J^{-1}$.

PROOF By definition, the Jacobian of the direct transformation $T$ is
$J = \left\|\partial \bar x^i/\partial x^j\right\|$, and the Jacobian of the inverse transformation $T^{-1}$ is $\bar J = \left\|\partial x^i/\partial \bar x^j\right\|$.
From the definition of matric multiplication, the element in the $i$th row and $j$th
column of the product $J\bar J$ is

$$\frac{\partial \bar x^i}{\partial x^k}\frac{\partial x^k}{\partial \bar x^j}$$

and, by Lemma 1, this sum is equal to $\delta^i_j$.
Hence, $\bar J = J^{-1}$; that is, the matrix $\bar J$ of the inverse transformation $T^{-1}$ is the
inverse of the matrix $J$ of the direct transformation $T$, as asserted.
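Theorem 1 can be checked numerically for a concrete transformation. The sketch below assumes the rectangular-to-spherical transformation as the example and approximates both Jacobian matrices by central differences; the helper names `jacobian` and `matmul` and the sample point are hypothetical choices made for the illustration.

```python
import math


def to_spherical(p):
    """Direct transformation: rectangular (x, y, z) -> (r, theta, phi)."""
    x, y, z = p
    r = math.sqrt(x * x + y * y + z * z)
    return [r, math.acos(z / r), math.atan2(y, x)]


def to_rectangular(q):
    """Inverse transformation: (r, theta, phi) -> rectangular (x, y, z)."""
    r, th, ph = q
    return [r * math.sin(th) * math.cos(ph),
            r * math.sin(th) * math.sin(ph),
            r * math.cos(th)]


def jacobian(f, p, h=1e-6):
    """Matrix M[i][j] = d f^i / d p^j, approximated by central differences."""
    n = len(p)
    cols = []
    for j in range(n):
        up, dn = list(p), list(p)
        up[j] += h
        dn[j] -= h
        fu, fd = f(up), f(dn)
        cols.append([(fu[i] - fd[i]) / (2 * h) for i in range(n)])
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


p = [1.0, 2.0, 2.0]                  # a point away from the polar axis
q = to_spherical(p)
J = jacobian(to_spherical, p)        # Jacobian of the direct transformation
J_bar = jacobian(to_rectangular, q)  # Jacobian of the inverse transformation
product = matmul(J, J_bar)           # should be close to the identity matrix
```

Within the accuracy of the finite differences, `product` is the identity matrix, which is the statement that the two Jacobian matrices are inverses of each other.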

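COROLLARY 1

If $J$ is the Jacobian of the transformation $T\colon \bar x^\alpha = \bar x^\alpha(x^1,x^2,x^3)$, then $\dfrac{\partial x^i}{\partial \bar x^j}$ is equal
to $1/|J|$ times the cofactor of $\dfrac{\partial \bar x^j}{\partial x^i}$ in $|J|$.

THEOREM 2

If $T_1\colon \bar x^\alpha = \bar x^\alpha(x^1,x^2,x^3)$ is a transformation with Jacobian $J_1$ and if
$T_2\colon \bar{\bar x}^\alpha = \bar{\bar x}^\alpha(\bar x^1,\bar x^2,\bar x^3)$
is a transformation with Jacobian $J_2$, then the Jacobian of the transformation
$T_2T_1$ is $J_2J_1$.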



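Let us now determine how the fundamental differential
quadratic form $(ds)^2 = g_{ij}\,dx^i\,dx^j$ transforms when the
coordinates $x^1$, $x^2$, $x^3$ are transformed into the coordinates $\bar x^1$, $\bar x^2$, $\bar x^3$
by means of Eqs. (13). For $dx^i$ and $dx^j$ we have, of course,

$$dx^i = \frac{\partial x^i}{\partial \bar x^\alpha}\,d\bar x^\alpha \qquad\text{and}\qquad dx^j = \frac{\partial x^j}{\partial \bar x^\beta}\,d\bar x^\beta$$

Hence, $(ds)^2$ becomes the quadratic form

$$g_{ij}\frac{\partial x^i}{\partial \bar x^\alpha}\frac{\partial x^j}{\partial \bar x^\beta}\,d\bar x^\alpha\,d\bar x^\beta$$

Therefore, if we write the quadratic form after transformation as

$$\bar g_{\alpha\beta}\,d\bar x^\alpha\,d\bar x^\beta$$

it follows that the coefficients $g_{ij}$ transform according to the law

$$\bar g_{\alpha\beta} = g_{ij}\frac{\partial x^i}{\partial \bar x^\alpha}\frac{\partial x^j}{\partial \bar x^\beta} \tag{17}$$

as we verified in the particular case of a transformation from
rectangular coordinates to oblique coordinates in the last section
[Eq. (26)]. Of course, considering the transformation from $\bar x$-coordinates
to $x$-coordinates, an argument identical with the one we
have just given provides us with the companion formula

$$g_{ij} = \bar g_{\alpha\beta}\frac{\partial \bar x^\alpha}{\partial x^i}\frac{\partial \bar x^\beta}{\partial x^j} \tag{18}$$

which also follows from an obvious modification of Lemma 3.

Formula (18) also leads to the following interesting conclusion:
From the rule for the multiplication of matrices, the element
in the $i$th row and $j$th column of the product $J^T\bar GJ$ is

$$\frac{\partial \bar x^\alpha}{\partial x^i}\,\bar g_{\alpha\beta}\,\frac{\partial \bar x^\beta}{\partial x^j}$$

However, by (18), this is the element $g_{ij}$ in the $i$th row and $j$th
column of $G$. Hence, taking determinants,

$$|J^T\bar GJ| = |G| \qquad\text{or}\qquad |J^T|\cdot|\bar G|\cdot|J| = |G|$$

or, finally,

$$|J|^2\,|\bar G| = |G| \tag{19}$$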
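The determinant relation can be checked numerically in a simple case. The sketch below assumes rectangular coordinates as the unbarred system (so that $G$ is the identity and $|G| = 1$) and spherical coordinates as the barred system; the function names and the sample point are hypothetical.

```python
import math


def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along row 0."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def inverse_jacobian(r, th, ph):
    """D[i][a] = d x^i / d xbar^a, x rectangular, xbar = (r, theta, phi)."""
    return [[math.sin(th) * math.cos(ph), r * math.cos(th) * math.cos(ph),
             -r * math.sin(th) * math.sin(ph)],
            [math.sin(th) * math.sin(ph), r * math.cos(th) * math.sin(ph),
             r * math.sin(th) * math.cos(ph)],
            [math.cos(th), -r * math.sin(th), 0.0]]


r, th, ph = 2.0, 0.7, 1.1
D = inverse_jacobian(r, th, ph)

# Law (17) with g_ij = delta_ij: the barred metric is D^T D.
G_bar = [[sum(D[i][a] * D[i][b] for i in range(3)) for b in range(3)]
         for a in range(3)]

det_J = 1.0 / det3(D)            # Theorem 1: J is the inverse of the matrix D
lhs = det_J ** 2 * det3(G_bar)   # |J|^2 |G_bar|
rhs = 1.0                        # |G| for rectangular coordinates
```

Up to rounding, `lhs` and `rhs` agree, and `det3(D)` reproduces the familiar factor $r^2\sin\theta$.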


EXAMPLE 1 

Obtain the formula for the differential of arc length in spherical coordinates. 

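Since we know the formula for the differential of arc length in rectangular coordinates,
namely,

$$(ds)^2 = (dx)^2 + (dy)^2 + (dz)^2 \tag{20}$$

we can obtain the corresponding formula in spherical coordinates by transformation from
rectangular coordinates. To do this, let $x^1$, $x^2$, $x^3$ denote, respectively, the rectangular coordinates
$x$, $y$, $z$, and let $\bar x^1$, $\bar x^2$, $\bar x^3$ denote, respectively, the spherical coordinates $r$, $\theta$, $\phi$ (Fig. 13.5). Then, as
usual, we have

[FIGURE 13.5 Plot showing the relation between rectangular and spherical coordinates.]

$$\begin{aligned}
x^1 &= \bar x^1\sin\bar x^2\cos\bar x^3 \\
x^2 &= \bar x^1\sin\bar x^2\sin\bar x^3 \\
x^3 &= \bar x^1\cos\bar x^2
\end{aligned}$$

and

$$\begin{aligned}
\frac{\partial x^1}{\partial \bar x^1} &= \sin\bar x^2\cos\bar x^3 & \frac{\partial x^1}{\partial \bar x^2} &= \bar x^1\cos\bar x^2\cos\bar x^3 & \frac{\partial x^1}{\partial \bar x^3} &= -\bar x^1\sin\bar x^2\sin\bar x^3 \\
\frac{\partial x^2}{\partial \bar x^1} &= \sin\bar x^2\sin\bar x^3 & \frac{\partial x^2}{\partial \bar x^2} &= \bar x^1\cos\bar x^2\sin\bar x^3 & \frac{\partial x^2}{\partial \bar x^3} &= \bar x^1\sin\bar x^2\cos\bar x^3 \\
\frac{\partial x^3}{\partial \bar x^1} &= \cos\bar x^2 & \frac{\partial x^3}{\partial \bar x^2} &= -\bar x^1\sin\bar x^2 & \frac{\partial x^3}{\partial \bar x^3} &= 0
\end{aligned}$$

Hence, substituting into Eq. (17) and noting from Eq. (20) that $g_{ij} = \delta_{ij}$, we have

$$\begin{aligned}
\bar g_{11} &= (\sin\bar x^2\cos\bar x^3)^2 + (\sin\bar x^2\sin\bar x^3)^2 + (\cos\bar x^2)^2 = 1 \\
\bar g_{22} &= (\bar x^1\cos\bar x^2\cos\bar x^3)^2 + (\bar x^1\cos\bar x^2\sin\bar x^3)^2 + (-\bar x^1\sin\bar x^2)^2 = (\bar x^1)^2 \\
\bar g_{33} &= (-\bar x^1\sin\bar x^2\sin\bar x^3)^2 + (\bar x^1\sin\bar x^2\cos\bar x^3)^2 = (\bar x^1\sin\bar x^2)^2 \\
\bar g_{12} &= (\sin\bar x^2\cos\bar x^3)(\bar x^1\cos\bar x^2\cos\bar x^3) + (\sin\bar x^2\sin\bar x^3)(\bar x^1\cos\bar x^2\sin\bar x^3) + (\cos\bar x^2)(-\bar x^1\sin\bar x^2) = 0 \\
\bar g_{13} &= (\sin\bar x^2\cos\bar x^3)(-\bar x^1\sin\bar x^2\sin\bar x^3) + (\sin\bar x^2\sin\bar x^3)(\bar x^1\sin\bar x^2\cos\bar x^3) = 0 \\
\bar g_{23} &= (\bar x^1\cos\bar x^2\cos\bar x^3)(-\bar x^1\sin\bar x^2\sin\bar x^3) + (\bar x^1\cos\bar x^2\sin\bar x^3)(\bar x^1\sin\bar x^2\cos\bar x^3) = 0
\end{aligned}$$

and, finally,

$$\begin{aligned}
(ds)^2 &= \bar g_{ij}\,d\bar x^i\,d\bar x^j \\
&= (d\bar x^1)^2 + (\bar x^1)^2(d\bar x^2)^2 + (\bar x^1\sin\bar x^2)^2(d\bar x^3)^2 \\
&= (dr)^2 + r^2(d\theta)^2 + (r\sin\theta)^2(d\phi)^2
\end{aligned}$$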

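When the coordinates $x^1$, $x^2$, $x^3$ are replaced by the coordinates
$\bar x^1$, $\bar x^2$, $\bar x^3$, there is, of course, a new set of parametric curves passing
through an arbitrary point $P$ and, hence, a new set of base
vectors $\bar{\mathbf e}_1$, $\bar{\mathbf e}_2$, $\bar{\mathbf e}_3$ and a new set of reciprocal base vectors $\bar{\mathbf e}^1$, $\bar{\mathbf e}^2$, $\bar{\mathbf e}^3$.
To obtain the relations between the old and the new base vectors,
let us consider an arbitrary infinitesimal displacement $d\mathbf s$ expressed
in terms of each system:

$$d\mathbf s = dx^\alpha\,\mathbf e_\alpha = d\bar x^i\,\bar{\mathbf e}_i \tag{21}$$

Now, from the transformation equations, we have

$$d\bar x^i = \frac{\partial \bar x^i}{\partial x^\alpha}\,dx^\alpha$$

and, hence, from (21), we can write

$$dx^\alpha\,\mathbf e_\alpha = \frac{\partial \bar x^i}{\partial x^\alpha}\,dx^\alpha\,\bar{\mathbf e}_i$$

Now the differentials are arbitrary; therefore, the coefficients of
corresponding differentials on each side of the last equation must
be equal. Thus,

$$\mathbf e_\alpha = \frac{\partial \bar x^i}{\partial x^\alpha}\,\bar{\mathbf e}_i \tag{22}$$

Similarly, or by Lemma 2,

$$\bar{\mathbf e}_i = \frac{\partial x^\alpha}{\partial \bar x^i}\,\mathbf e_\alpha \tag{23}$$

Formulas (29) and (28) of Sec. 13.2 were, respectively, of course,
special cases of these relations.

Knowing from Eq. (6) how the local base vectors are expressed
in terms of the local reciprocal base vectors in any coordinate
system and knowing from Eq. (23) how the local base vectors
transform, we can now determine how the local reciprocal base
vectors transform. For, beginning with the relation (6) for the
new coordinate system, namely,

$$\bar{\mathbf e}_i = \bar g_{ij}\,\bar{\mathbf e}^j$$

and substituting from Eqs. (17) and (23), we have

$$\frac{\partial x^a}{\partial \bar x^i}\,\mathbf e_a = g_{\alpha\beta}\frac{\partial x^\alpha}{\partial \bar x^i}\frac{\partial x^\beta}{\partial \bar x^j}\,\bar{\mathbf e}^j$$

Now, multiplying this equation by $\partial \bar x^i/\partial x^\gamma$ and summing each side
over $i$, we obtain

$$\left(\frac{\partial x^a}{\partial \bar x^i}\frac{\partial \bar x^i}{\partial x^\gamma}\right)\mathbf e_a = g_{\alpha\beta}\left(\frac{\partial x^\alpha}{\partial \bar x^i}\frac{\partial \bar x^i}{\partial x^\gamma}\right)\frac{\partial x^\beta}{\partial \bar x^j}\,\bar{\mathbf e}^j$$

or, using Lemma 1,

$$\delta^a_\gamma\,\mathbf e_a = g_{\alpha\beta}\,\delta^\alpha_\gamma\,\frac{\partial x^\beta}{\partial \bar x^j}\,\bar{\mathbf e}^j \qquad\text{and}\qquad \mathbf e_\gamma = g_{\gamma\beta}\frac{\partial x^\beta}{\partial \bar x^j}\,\bar{\mathbf e}^j$$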


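Now, if we multiply the last equation by $g^{\lambda\gamma}$ and sum over $\gamma$,
making use of the fact that $\|g^{ij}\|$ is the inverse of $\|g_{ij}\|$ and, hence,
that $g^{\lambda\gamma}g_{\gamma\beta} = \delta^\lambda_\beta$, we have

$$g^{\lambda\gamma}\,\mathbf e_\gamma = \delta^\lambda_\beta\,\frac{\partial x^\beta}{\partial \bar x^j}\,\bar{\mathbf e}^j = \frac{\partial x^\lambda}{\partial \bar x^j}\,\bar{\mathbf e}^j$$

and, using Eq. (7),

$$\mathbf e^\lambda = \bar{\mathbf e}^j\,\frac{\partial x^\lambda}{\partial \bar x^j} \tag{24}$$

Similarly, or by using Lemma 2,

$$\bar{\mathbf e}^j = \mathbf e^\lambda\,\frac{\partial \bar x^j}{\partial x^\lambda} \tag{25}$$

Equations (31) and (30) of Sec. 13.2 were, of course, respectively,
special cases of these relations.

From Eq. (8) applied to the new coordinate system, we have

$$\bar{\mathbf e}^i\cdot\bar{\mathbf e}^j = \bar g^{ij}$$

Hence, using (25), we have

$$\left(\mathbf e^\alpha\frac{\partial \bar x^i}{\partial x^\alpha}\right)\cdot\left(\mathbf e^\beta\frac{\partial \bar x^j}{\partial x^\beta}\right) = \bar g^{ij}$$

or, since $\mathbf e^\alpha\cdot\mathbf e^\beta = g^{\alpha\beta}$,

$$\bar g^{ij} = g^{\alpha\beta}\frac{\partial \bar x^i}{\partial x^\alpha}\frac{\partial \bar x^j}{\partial x^\beta} \tag{26}$$

which is the law of transformation for the $g^{ij}$'s. Equation (27),
Sec. 13.2, is a special case of this result. Similarly, or by Lemma 3,

$$g^{\alpha\beta} = \bar g^{ij}\frac{\partial x^\alpha}{\partial \bar x^i}\frac{\partial x^\beta}{\partial \bar x^j}$$
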

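When a vector represented contravariantly, that is, a vector
expressed in terms of the local base vectors $\mathbf e_1$, $\mathbf e_2$, $\mathbf e_3$, say

$$\mathbf V = v^1\mathbf e_1 + v^2\mathbf e_2 + v^3\mathbf e_3 = v^\alpha\mathbf e_\alpha$$

is expressed in terms of the corresponding local base vectors
$\bar{\mathbf e}_1$, $\bar{\mathbf e}_2$, $\bar{\mathbf e}_3$ of a new coordinate system, we have, using (22), the new
representation

$$\mathbf V = v^\alpha\frac{\partial \bar x^i}{\partial x^\alpha}\,\bar{\mathbf e}_i = \bar v^i\,\bar{\mathbf e}_i$$

Hence, the components of a contravariant vector transform
according to the law

$$\bar v^i = v^\alpha\frac{\partial \bar x^i}{\partial x^\alpha} \tag{27}$$

Similarly, for a vector represented covariantly, that is, a
vector expressed in terms of the local reciprocal base vectors, say

$$\mathbf V = v_1\mathbf e^1 + v_2\mathbf e^2 + v_3\mathbf e^3 = v_\alpha\mathbf e^\alpha$$

we have, using (24), the new representation

$$\mathbf V = v_\alpha\frac{\partial x^\alpha}{\partial \bar x^i}\,\bar{\mathbf e}^i = \bar v_i\,\bar{\mathbf e}^i$$

Hence, the components of a covariant vector transform according
to the law

$$\bar v_i = v_\alpha\frac{\partial x^\alpha}{\partial \bar x^i} \tag{28}$$

Equations (33) and (32), Sec. 13.2, were special cases of Eqs. (27)
and (28), respectively.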
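The two laws can be tried out on a small case. The sketch below is two-dimensional rather than three-dimensional and assumes a linear change of coordinates $\bar x = Ax$, so that the partial derivatives are simply the constant entries of `A` and of its inverse; all numbers are illustrative.

```python
# Linear change of coordinates xbar = A x (an assumption of this sketch), so
# d xbar^i / d x^a = A[i][a]  and  d x^a / d xbar^i = A_inv[a][i].
A = [[2.0, 1.0],
     [1.0, 1.0]]
A_inv = [[1.0, -1.0],
         [-1.0, 2.0]]   # inverse of A (determinant of A is 1)

v = [3.0, -1.0]   # contravariant components v^a
w = [0.5, 4.0]    # covariant components w_a

# Law (27): vbar^i = v^a * d xbar^i / d x^a
v_bar = [sum(v[a] * A[i][a] for a in range(2)) for i in range(2)]

# Law (28): wbar_i = w_a * d x^a / d xbar^i
w_bar = [sum(w[a] * A_inv[a][i] for a in range(2)) for i in range(2)]

# The contraction v^a w_a takes the same value in both systems.
inner_old = sum(v[a] * w[a] for a in range(2))
inner_new = sum(v_bar[i] * w_bar[i] for i in range(2))
```

Because one set of components transforms with the matrix and the other with its inverse, the sum $v^\alpha w_\alpha$ is unchanged, which is the reason the two laws are paired.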


EXERCISES 


1 


If the range of each index is 3, write out each of the following sums: 
a f(xi) Axi b aijXiXj 

c auxiyi d 


(diX*) 2 


f — 


iXiXx 
dx' dy’ 
~dyi~dx 1 ‘ S ' 


dx' dy’ k 
® dy’ dx k 

If the range of each index is n, show that 


a 5/ 5k’ = 5k' 
c 8j k Aj = Ak 

e Sj'Aijk Si k — Am 


b Si* — V = n 
d Sk^Ak = A k Ak 


3 Write out the proofs of Theorems 1 and 6, Sec. 10.2, using the summation convention rather
than the $\Sigma$ notation.

4 Write out the proofs of the remaining assertions of Lemma 3. 

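5 Prove the following lemma: If $\phi_{ij}$ and $\bar\phi_{ij}$ ($i,j = 1, 2, 3$) are, respectively, functions of $x^1$, $x^2$,
$x^3$ and $\bar x^1$, $\bar x^2$, $\bar x^3$, then any one of the relations

$$\bar\phi_{ij} = \phi_{\alpha\beta}\frac{\partial x^\alpha}{\partial \bar x^i}\frac{\partial x^\beta}{\partial \bar x^j}$$

$$\bar\phi_{ij}\frac{\partial \bar x^i}{\partial x^\alpha} = \phi_{\alpha\beta}\frac{\partial x^\beta}{\partial \bar x^j}$$

$$\bar\phi_{ij}\frac{\partial \bar x^j}{\partial x^\beta} = \phi_{\alpha\beta}\frac{\partial x^\alpha}{\partial \bar x^i}$$

$$\bar\phi_{ij}\frac{\partial \bar x^i}{\partial x^\alpha}\frac{\partial \bar x^j}{\partial x^\beta} = \phi_{\alpha\beta}$$

implies each of the others.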


Each of the following problems refers to a cylindrical coordinate system.


6 a What is the differential of arc in cylindrical coordinates?
b What are $G$ and $G^{-1}$ in cylindrical coordinates?

7 a What are the lengths of the local base vectors at $(2,0,0)$? at $(2,0,1)$? at $(2,\pi/3,0)$? at
$(2,\pi/3,1)$?

b What are the lengths of the local reciprocal base vectors at each of the points in part a?

8 If $\mathbf e_1$, $\mathbf e_2$, $\mathbf e_3$ are the base vectors at the point $(2,\pi/3,1)$ and if

U = + 3es + e 3 and V = ei — e 2 + 2e 3 

what is the length of $\mathbf U$? the length of $\mathbf V$? the angle between $\mathbf U$ and $\mathbf V$?

9 Let $\mathbf V$ be the vector extending from the point $(2,0,1)$ to $(2,\pi/3,1)$. Express $\mathbf V$ in terms of the
base vectors at $(2,0,1)$ and also in terms of the base vectors at $(2,\pi/3,1)$. Check the length
of $\mathbf V$, using each of these representations.

10 Work Exercise 9 using the reciprocal base vectors at $(2,0,1)$ and at $(2,\pi/3,1)$.


13.4

Tensors

In the last section, without having referred to them by name, we
were already working with tensors. Now, with the experience we
have gained from our discussion of coordinate transformations in
three dimensions, we can make the matter explicit:

Let $(x^1,x^2,\ldots,x^n)$ and $(\bar x^1,\bar x^2,\ldots,\bar x^n)$ be generalized
coordinates in $n$ dimensions, and let the two systems of coordinates
be related by the transformation equations

$$\begin{aligned}
T&\colon\ \bar x^i = \bar x^i(x^1,x^2,\ldots,x^n) \\
T^{-1}&\colon\ x^i = x^i(\bar x^1,\bar x^2,\ldots,\bar x^n)
\end{aligned} \qquad i = 1, 2, \ldots, n \tag{1}$$

Once we pass beyond the three-dimensional space of experience,
geometric intuition is of little help to us. However, it can be
shown that in $n$ dimensions, just as in three dimensions, there are
$n$ parametric curves passing through an arbitrary point and on
each of these curves one and only one of the generalized coordinates
varies. Moreover, if the Jacobian determinant of the transformation
(1) is different from zero, vectors tangent to the parametric
curves through an arbitrary point can be shown to be linearly
independent. Hence, if local base vectors $\mathbf e_i$ ($i = 1, 2, \ldots, n$)
are defined at an arbitrary point $P$ by the conditions that

$$ds_i = |\mathbf e_i\,dx^i| = \sqrt{\mathbf e_i\cdot\mathbf e_i}\;|dx^i| \qquad i \text{ not summed}$$

any vector originating at $P$ can be expressed as a linear combination
of these vectors. Furthermore, a set of independent local
reciprocal base vectors $\mathbf e^i$ ($i = 1, 2, \ldots, n$) can be defined at
any point $P$ by the same conditions we used in three dimensions,
namely,

$$\mathbf e^i\cdot\mathbf e_j = \delta^i_j$$

and any vector originating at $P$ can also be represented as a linear
combination of the reciprocal base vectors at $P$. In fact, all the
results of the last section, with the exception of those involving
scalar triple products, are equally correct in $n$ dimensions, provided
only that the summation convention is understood to cover
the range 1 to $n$ instead of the range 1 to 3.


† Just as in three dimensions, we assume that the metrical properties of
$n$-dimensional space are defined by a positive-definite differential quadratic
form $(ds)^2 = g_{ij}\,dx^i\,dx^j$, $i,j = 1, 2, \ldots, n$, whose matrix $G = \|g_{ij}\|$ is,
of course, nonsingular.


By a scalar, or tensor of rank zero, we mean a quantity $S$
whose descriptions in the two coordinate systems are connected
by the relation

$$\bar S(\bar x^1,\bar x^2,\ldots,\bar x^n) = S(x^1,x^2,\ldots,x^n) \tag{2}$$

By a contravariant vector, or contravariant tensor of rank 1,
we mean a set of $n$ quantities $\xi^i$, called components, whose descriptions
in the two coordinate systems are connected by the relations

$$\bar\xi^i(\bar x^1,\bar x^2,\ldots,\bar x^n) = \xi^\alpha(x^1,x^2,\ldots,x^n)\,\frac{\partial \bar x^i}{\partial x^\alpha} \qquad i = 1, 2, \ldots, n \tag{3}$$

Since $d\bar x^i = \dfrac{\partial \bar x^i}{\partial x^\alpha}\,dx^\alpha$, it follows that the differentials of the
coordinate variables are the components of a contravariant tensor of
rank 1.

By a covariant vector, or covariant tensor of rank 1, we mean
a set of $n$ quantities $\xi_i$, also called components, whose descriptions
in the two coordinate systems are connected by the relations

$$\bar\xi_i(\bar x^1,\bar x^2,\ldots,\bar x^n) = \xi_\alpha(x^1,x^2,\ldots,x^n)\,\frac{\partial x^\alpha}{\partial \bar x^i} \qquad i = 1, 2, \ldots, n \tag{4}$$

If $\phi$ is a scalar point function [for which, therefore,
$\bar\phi(\bar x^1,\bar x^2,\ldots,\bar x^n) = \phi(x^1,x^2,\ldots,x^n)$], then

$$\frac{\partial \bar\phi}{\partial \bar x^i} = \frac{\partial \phi}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial \bar x^i}$$

Hence, the $n$ quantities $\dfrac{\partial \phi}{\partial x^i}$ are the components of a covariant
tensor of rank 1, which we recognize as the gradient of $\phi$.
A contravariant tensor of rank 2 is a set of $n^2$ quantities $\xi^{ij}$
whose descriptions in the two coordinate systems are connected
by the relations

$$\bar\xi^{ij} = \xi^{\alpha\beta}\frac{\partial \bar x^i}{\partial x^\alpha}\frac{\partial \bar x^j}{\partial x^\beta} \qquad i,j = 1, 2, \ldots, n \tag{5}$$

From Eq. (26), Sec. 13.3, it is clear that the elements $g^{ij}$ of the
matrix $G^{-1}$ form a contravariant tensor of rank 2.

A covariant tensor of rank 2 is a set of $n^2$ quantities $\xi_{ij}$ whose
descriptions in the two coordinate systems are connected by the
relations

$$\bar\xi_{ij} = \xi_{\alpha\beta}\frac{\partial x^\alpha}{\partial \bar x^i}\frac{\partial x^\beta}{\partial \bar x^j} \qquad i,j = 1, 2, \ldots, n \tag{6}$$

From Eq. (17), Sec. 13.3, it is clear that the elements $g_{ij}$ of the
fundamental matrix $G$ form a covariant tensor of rank 2. This
tensor is often called the fundamental metric tensor.

A mixed tensor of rank 2 is a set of $n^2$ quantities $\xi^i_j$ whose
descriptions in the two coordinate systems are connected by the
relations

$$\bar\xi^i_j = \xi^\alpha_\beta\,\frac{\partial \bar x^i}{\partial x^\alpha}\frac{\partial x^\beta}{\partial \bar x^j} \tag{7}$$

Although we shall leave the proof of this fact as an exercise, $\delta^i_j$ is
an example of a mixed tensor of rank 2.

A tensor $\xi^{ij}$ ($\xi_{ij}$) such that $\xi^{ij} = \xi^{ji}$ ($\xi_{ij} = \xi_{ji}$) for all values of
$i$ and $j$ is said to be symmetric. A tensor $\xi^{ij}$ ($\xi_{ij}$) such that
$\xi^{ij} = -\xi^{ji}$ ($\xi_{ij} = -\xi_{ji}$) for all values of $i$ and $j$ is said to be
skew-symmetric or alternating.

The concept of a tensor can, clearly, be generalized to include
tensors of arbitrary rank $r$ with any number $k$ ($0 \le k \le r$) of
covariant indices and $r - k$ contravariant indices. For instance, a
set of $n^5$ quantities $\xi^{ij}_{uvw}$ whose descriptions in the two coordinate
systems are connected by the relations

$$\bar\xi^{ij}_{uvw} = \xi^{\alpha\beta}_{abc}\,\frac{\partial \bar x^i}{\partial x^\alpha}\frac{\partial \bar x^j}{\partial x^\beta}\frac{\partial x^a}{\partial \bar x^u}\frac{\partial x^b}{\partial \bar x^v}\frac{\partial x^c}{\partial \bar x^w}$$

constitutes a mixed tensor of rank 5 with two contravariant
indices $i$ and $j$ and three covariant indices $u$, $v$, and $w$.

From the definition of a tensor as a set of quantities which 
transform in a prescribed way, it is clear that a tensor can be 
constructed by specifying its components in one coordinate system 
arbitrarily and then letting the appropriate transformation laws 
define its components in all other coordinate systems. 

The algebra of tensors is based primarily upon the following
observations:

Two tensors are equal if and only if they have the same rank 
and the same number of indices of each type and have their cor- 
responding components equal in one and, hence, in all coordinate 
systems. In particular, if the components of a tensor are all zero in 
one coordinate system, they are all zero in every coordinate system. 

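If $T_1$ and $T_2$ are tensors of the same type, then the set of
quantities obtained by adding respective components of $T_1$ and
$T_2$ is a tensor $T_1 + T_2$ of the same type as $T_1$ and $T_2$.

If $T_1$ is a tensor of rank $r_1$ with $c_1$ contravariant and $\gamma_1$
covariant indices and if $T_2$ is a tensor of rank $r_2$ with $c_2$
contravariant and $\gamma_2$ covariant indices, then the set of quantities
obtained by multiplying each component of $T_1$ by each
component of $T_2$ is a tensor $T_1T_2$, called the outer product of $T_1$ and
$T_2$, of rank $r_1 + r_2$ with $c_1 + c_2$ contravariant indices and $\gamma_1 + \gamma_2$
covariant indices. For instance, if $T_1$ is the tensor $\xi^i_j$ and $T_2$ is the
tensor $\eta^{kl}$, then the general term $\xi^i_j\eta^{kl}$ transforms according to the
law

$$\bar\xi^i_j\,\bar\eta^{kl} = \left(\xi^\alpha_\beta\,\frac{\partial \bar x^i}{\partial x^\alpha}\frac{\partial x^\beta}{\partial \bar x^j}\right)\left(\eta^{\gamma\delta}\,\frac{\partial \bar x^k}{\partial x^\gamma}\frac{\partial \bar x^l}{\partial x^\delta}\right) = \xi^\alpha_\beta\,\eta^{\gamma\delta}\,\frac{\partial \bar x^i}{\partial x^\alpha}\frac{\partial x^\beta}{\partial \bar x^j}\frac{\partial \bar x^k}{\partial x^\gamma}\frac{\partial \bar x^l}{\partial x^\delta}$$

which shows that $T_1T_2 = \xi^i_j\eta^{kl}$ is a tensor, say $\zeta^{ikl}_j$, of rank 4 with
3 contravariant and 1 covariant indices.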

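If, in a tensor of any type, a contravariant index is summed
against a covariant index by simply setting one index equal to the
other and thereby invoking the summation convention, the resulting
set of quantities is a tensor with one less contravariant index
and one less covariant index. For example, since the tensor $\xi^{ij}_k$
transforms according to the law

$$\bar\xi^{ij}_k = \xi^{\alpha\beta}_\gamma\,\frac{\partial \bar x^i}{\partial x^\alpha}\frac{\partial \bar x^j}{\partial x^\beta}\frac{\partial x^\gamma}{\partial \bar x^k}$$

we have, setting $j = k$,

$$\begin{aligned}
\bar\xi^{ij}_j &= \xi^{\alpha\beta}_\gamma\,\frac{\partial \bar x^i}{\partial x^\alpha}\frac{\partial \bar x^j}{\partial x^\beta}\frac{\partial x^\gamma}{\partial \bar x^j} \\
&= \xi^{\alpha\beta}_\gamma\,\frac{\partial \bar x^i}{\partial x^\alpha}\,\delta^\gamma_\beta \qquad \text{(by Lemma 1, Sec. 13.3)} \\
&= \xi^{\alpha\beta}_\beta\,\frac{\partial \bar x^i}{\partial x^\alpha}
\end{aligned}$$

since only when $\beta = \gamma$ is $\delta^\gamma_\beta \neq 0$. Hence, $\xi^{ij}_j$ transforms as a
contravariant tensor of rank 1; that is, $\xi^{ij}_j$ is a contravariant
vector, say $\eta^i$. This process of obtaining one tensor from another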
is known as contraction. Obviously, the process of contraction 
can be repeated as long as there are indices of each type. When 
the process of contraction is applied to the product of two tensors 
in such a way that at each stage one of the two indices belongs to 
the first factor and the other to the second, the resulting tensor 
is said to be the inner product of the two tensors with respect to 
the given set of indices. 
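The simplest instance of contraction, a mixed tensor of rank 2 contracted to a scalar, can be checked numerically. The sketch below is two-dimensional and assumes a linear change of coordinates $\bar x = Ax$ so that the partial derivatives are constants; the matrices are illustrative and the variable names are hypothetical.

```python
# Contraction of a mixed tensor of rank 2 under xbar = A x (an assumption of
# this sketch): d xbar^i / d x^a = A[i][a], d x^b / d xbar^j = A_inv[b][j].
A = [[2.0, 1.0],
     [1.0, 1.0]]
A_inv = [[1.0, -1.0],
         [-1.0, 2.0]]

T = [[1.0, 2.0],
     [3.0, 4.0]]   # T[a][b] stands for the component with upper a, lower b

# Mixed-tensor law: Tbar^i_j = T^a_b (d xbar^i / d x^a)(d x^b / d xbar^j)
T_bar = [[sum(T[a][b] * A[i][a] * A_inv[b][j]
              for a in range(2) for b in range(2))
          for j in range(2)]
         for i in range(2)]

# Setting the upper index equal to the lower one and summing (contraction).
trace_old = sum(T[a][a] for a in range(2))
trace_new = sum(T_bar[i][i] for i in range(2))
```

The individual components change, but the contracted quantity is the same in both systems, as a tensor of rank zero must be.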

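The converse of the last observation is also important: A set
of $n^r$ quantities is a tensor provided an inner product of the set and
an arbitrary tensor is also a tensor. The proof of this assertion
should be sufficiently clear from the argument for the special
case $r = 2$: Suppose, then, that $\xi^\alpha_\beta$ is a set of $n^2$ quantities such
that, for an arbitrary tensor of the second rank, say $\eta^\beta_\delta$, we have

$$\xi^\alpha_\beta\,\eta^\beta_\delta = \zeta^\alpha_\delta \tag{8}$$

where $\zeta^\alpha_\delta$ is a tensor. Under an arbitrary transformation of
coordinates we have, of course,

$$\bar\xi^a_b\,\bar\eta^b_d = \bar\zeta^a_d \tag{9}$$

Now, since $\eta^\beta_\delta$ and $\zeta^\alpha_\delta$ are tensors of the indicated type, we have,
by definition,

$$\bar\eta^b_d = \eta^\beta_\delta\,\frac{\partial \bar x^b}{\partial x^\beta}\frac{\partial x^\delta}{\partial \bar x^d} \qquad\text{and}\qquad \bar\zeta^a_d = \zeta^\alpha_\delta\,\frac{\partial \bar x^a}{\partial x^\alpha}\frac{\partial x^\delta}{\partial \bar x^d}$$

Hence, substituting into (9) and then using (8), we have

$$\bar\xi^a_b\,\eta^\beta_\delta\,\frac{\partial \bar x^b}{\partial x^\beta}\frac{\partial x^\delta}{\partial \bar x^d} = \zeta^\alpha_\delta\,\frac{\partial \bar x^a}{\partial x^\alpha}\frac{\partial x^\delta}{\partial \bar x^d} = \xi^\alpha_\beta\,\eta^\beta_\delta\,\frac{\partial \bar x^a}{\partial x^\alpha}\frac{\partial x^\delta}{\partial \bar x^d}$$

From this, by transposing and collecting terms, we obtain

$$\left(\bar\xi^a_b\,\frac{\partial \bar x^b}{\partial x^\beta} - \xi^\alpha_\beta\,\frac{\partial \bar x^a}{\partial x^\alpha}\right)\eta^\beta_\delta\,\frac{\partial x^\delta}{\partial \bar x^d} = 0$$

If we now form the inner product of the expression on the left
with $\partial \bar x^d/\partial x^\epsilon$ and recall from Lemma 1, Sec. 13.3, that

$$\frac{\partial x^\delta}{\partial \bar x^d}\frac{\partial \bar x^d}{\partial x^\epsilon} = \delta^\delta_\epsilon$$

we have

$$\left(\bar\xi^a_b\,\frac{\partial \bar x^b}{\partial x^\beta} - \xi^\alpha_\beta\,\frac{\partial \bar x^a}{\partial x^\alpha}\right)\eta^\beta_\delta\,\delta^\delta_\epsilon = \left(\bar\xi^a_b\,\frac{\partial \bar x^b}{\partial x^\beta} - \xi^\alpha_\beta\,\frac{\partial \bar x^a}{\partial x^\alpha}\right)\eta^\beta_\epsilon = 0$$

Now, since $\eta^\beta_\epsilon$ is completely arbitrary, it may be chosen, in turn,
to be a tensor each of whose components except one is equal to
zero. Hence it follows that the expression in parentheses in the
last equation must be identically zero. Therefore,

$$\bar\xi^a_b\,\frac{\partial \bar x^b}{\partial x^\beta} = \xi^\alpha_\beta\,\frac{\partial \bar x^a}{\partial x^\alpha}$$

Finally, if we form the inner product of each member of this
equation with $\partial x^\beta/\partial \bar x^p$ and again use Lemma 1, Sec. 13.3, we have

$$\bar\xi^a_b\,\delta^b_p = \bar\xi^a_p = \xi^\alpha_\beta\,\frac{\partial \bar x^a}{\partial x^\alpha}\frac{\partial x^\beta}{\partial \bar x^p}$$

which is precisely the law of transformation for a mixed tensor of
rank 2. In other words, $\xi^\alpha_\beta$ is a tensor, as asserted. The property
we have confirmed in this particular case is often referred to as
the quotient law for tensors.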

The preceding observation is frequently the most effective
means of proving that a set of quantities is a tensor. For instance,
by its use we can establish the following interesting result: If the
elements of a nonsingular matrix $\|f_{ij}\|$ are the components of a
covariant tensor of rank 2, then the elements of the inverse matrix $\|f^{ij}\|$
are the components of a contravariant tensor of rank 2. To prove this,
let $\xi^\alpha$ be any contravariant vector. Then, by the process of
contraction,

$$\eta_i = f_{i\alpha}\,\xi^\alpha$$

is a covariant vector. Moreover, since $\|f_{ij}\|$ is nonsingular and $\xi^\alpha$
is arbitrary, $\eta_i$ is also arbitrary; that is, any covariant vector $\eta_i$
can be obtained in this fashion from a suitable contravariant
vector $\xi^\alpha$. If we now form the inner product of each member of
the last equation with $f^{\beta i}$, we have

$$f^{\beta i}\,\eta_i = f^{\beta i}f_{i\alpha}\,\xi^\alpha$$

However, since $\|f_{ij}\|$ and $\|f^{ij}\|$ are inverses, it follows that

$$f^{\beta i}f_{i\alpha} = \delta^\beta_\alpha$$

Hence,

$$f^{\beta i}\,\eta_i = \delta^\beta_\alpha\,\xi^\alpha = \xi^\beta$$

Since $\eta_i$ is arbitrary, it follows from the quotient law that $f^{ij}$ is
a tensor, as asserted.
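This result can also be seen at work numerically. The sketch below is two-dimensional and assumes a linear change of coordinates $\bar x = Ax$; it transforms `f` by the covariant law, transforms the inverse of `f` by the contravariant law, and compares. The helper `inv2` and all numbers are illustrative assumptions.

```python
# Linear change of coordinates xbar = A x (an assumption of this sketch), so
# d xbar^i / d x^a = A[i][a]  and  d x^a / d xbar^i = A_inv[a][i].
A = [[2.0, 1.0],
     [1.0, 1.0]]
A_inv = [[1.0, -1.0],
         [-1.0, 2.0]]


def inv2(M):
    """Inverse of a nonsingular 2 x 2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


f = [[2.0, 1.0],
     [1.0, 3.0]]          # components of a covariant tensor of rank 2
f_inv = inv2(f)           # elements of the inverse matrix

# Covariant law (6): fbar_ij = f_ab (d x^a / d xbar^i)(d x^b / d xbar^j)
f_bar = [[sum(f[a][b] * A_inv[a][i] * A_inv[b][j]
              for a in range(2) for b in range(2))
          for j in range(2)] for i in range(2)]

# Contravariant law (5) applied to the elements of the inverse matrix.
f_inv_bar = [[sum(f_inv[a][b] * A[i][a] * A[j][b]
                  for a in range(2) for b in range(2))
              for j in range(2)] for i in range(2)]

# If the inverse really is a contravariant tensor, this product is the identity.
product = [[sum(f_bar[i][k] * f_inv_bar[k][j] for k in range(2))
            for j in range(2)] for i in range(2)]
```

The product of the two transformed matrices is again the identity, so the contravariantly transformed inverse is the inverse of the covariantly transformed matrix.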

EXERCISES 

1 Verify that $\delta^i_j$ is a mixed tensor of rank 2.

2 a Is £i* a tensor? b Is 5, 2 a tensor? 

3 Verify that, if $T_1$ and $T_2$ are tensors of the same type, then $T_1 \pm T_2$ is also a tensor of that
type.

4 Verify that, if $T$ is a tensor and $\phi$ is a scalar, then the set of quantities obtained by multiplying
each component of $T$ by $\phi$ is a tensor of the same type as $T$.

5 a Is a tensor obtained if two covariant indices in a tensor are summed against each other?
b Is a tensor obtained if two contravariant indices in a tensor are summed against each
other?

6 If the elements of a nonsingular matrix $\|f^i_j\|$ are the components of a mixed tensor of rank 2,
do the elements of the inverse matrix form a tensor?

7 a Let be a set of n 2 quantities, and let ifr be an arbitrary contravariant tensor of rank 
2. Show that is a tensor if the product tjp a y &y is a tensor f®?. 

b Show that £p a is a tensor if its inner product with an arbitrary covariant tensor is also a 
tensor. 

8 a Show that f a| 8 is a tensor if its inner product with an arbitrary mixed tensor P is a tensor, 

b Show that is a tensor if the product is a tensor, being an arbitrary covariant 

tensor of rank 2. 

9 If ij*0 is an arbitrary tensor of the indicated type, show that the n 3 quantities form a 
tensor if the product &fy s a p is a tensor f 7 ®. 

10 Show how the contravariant representation of a vector can be obtained from its covariant 
representation, and vice versa. 


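13.5

Divergence and curl

We have already seen (Sec. 13.4) that, if $\phi$ is a scalar point function,
then $\partial\phi/\partial x^i$ is a covariant vector, which we recognized as the
gradient of $\phi$. We now turn our attention to the determination of
the divergence and curl of vectors in generalized coordinates.

For the divergence of a contravariant vector $\xi^\alpha$ we have the
expression

$$\frac{1}{\sqrt{|G|}}\frac{\partial}{\partial x^\alpha}\left(\sqrt{|G|}\,\xi^\alpha\right) \tag{1}$$

where $G$ is the metric tensor of the space. In rectangular coordinates,
for which

$$G = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{Vmatrix}$$

this is clearly the divergence, as defined in Sec. 12.3. However,
before this can be accepted as a definition of the divergence in any
coordinate system we must prove that it is a scalar invariant;
that is, that it is the same in all coordinate systems. To do this,
let us consider the given expression in a second coordinate system:

$$\begin{aligned}
\frac{1}{\sqrt{|\bar G|}}\frac{\partial}{\partial \bar x^a}\left(\sqrt{|\bar G|}\,\bar\xi^a\right) &= \frac{1}{\sqrt{|\bar G|}}\left(\frac{1}{2\sqrt{|\bar G|}}\frac{\partial |\bar G|}{\partial \bar x^a}\,\bar\xi^a + \sqrt{|\bar G|}\,\frac{\partial \bar\xi^a}{\partial \bar x^a}\right) \\
&= \frac{1}{2|\bar G|}\frac{\partial |\bar G|}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial \bar x^a}\,\bar\xi^a + \frac{\partial \bar\xi^a}{\partial \bar x^a}
\end{aligned} \tag{2}$$

By hypothesis, $\xi^\alpha$ is a contravariant vector. Hence,

$$\bar\xi^a = \xi^b\,\frac{\partial \bar x^a}{\partial x^b} \tag{3}$$

and

$$\frac{\partial \bar\xi^a}{\partial \bar x^a} = \frac{\partial}{\partial x^c}\left(\xi^b\,\frac{\partial \bar x^a}{\partial x^b}\right)\frac{\partial x^c}{\partial \bar x^a} = \frac{\partial \xi^b}{\partial x^c}\frac{\partial \bar x^a}{\partial x^b}\frac{\partial x^c}{\partial \bar x^a} + \xi^b\,\frac{\partial^2 \bar x^a}{\partial x^b\,\partial x^c}\frac{\partial x^c}{\partial \bar x^a} \tag{4}$$

Also, from Eq. (19), Sec. 13.3, $|\bar G| = |G|\cdot|J|^{-2}$, where $J$ is the
Jacobian of the transformation. Therefore, by differentiating and
dividing by $|\bar G|$, we obtain

$$\frac{1}{|\bar G|}\frac{\partial |\bar G|}{\partial x^\alpha} = \frac{1}{|G|}\frac{\partial |G|}{\partial x^\alpha} - \frac{2}{|J|}\frac{\partial |J|}{\partial x^\alpha} \tag{5}$$

Thus, substituting from Eqs. (3), (4), and (5) into Eq. (2), we
have

$$\begin{aligned}
\frac{1}{\sqrt{|\bar G|}}\frac{\partial}{\partial \bar x^a}\left(\sqrt{|\bar G|}\,\bar\xi^a\right) &= \frac12\left(\frac{1}{|G|}\frac{\partial |G|}{\partial x^\alpha} - \frac{2}{|J|}\frac{\partial |J|}{\partial x^\alpha}\right)\frac{\partial x^\alpha}{\partial \bar x^a}\,\xi^b\,\frac{\partial \bar x^a}{\partial x^b} + \frac{\partial \xi^b}{\partial x^c}\frac{\partial \bar x^a}{\partial x^b}\frac{\partial x^c}{\partial \bar x^a} + \xi^b\,\frac{\partial^2 \bar x^a}{\partial x^b\,\partial x^c}\frac{\partial x^c}{\partial \bar x^a} \\
&= \left(\frac{1}{2|G|}\frac{\partial |G|}{\partial x^\alpha} - \frac{1}{|J|}\frac{\partial |J|}{\partial x^\alpha}\right)\xi^\alpha + \frac{\partial \xi^\alpha}{\partial x^\alpha} + \xi^\alpha\,\frac{\partial^2 \bar x^i}{\partial x^\alpha\,\partial x^b}\frac{\partial x^b}{\partial \bar x^i} \\
&= \left(\frac{1}{2|G|}\frac{\partial |G|}{\partial x^\alpha}\,\xi^\alpha + \frac{\partial \xi^\alpha}{\partial x^\alpha}\right) + \left(\frac{\partial^2 \bar x^i}{\partial x^\alpha\,\partial x^b}\frac{\partial x^b}{\partial \bar x^i} - \frac{1}{|J|}\frac{\partial |J|}{\partial x^\alpha}\right)\xi^\alpha
\end{aligned} \tag{6}$$

Now, the first quantity in parentheses in the last expression is
precisely

$$\frac{1}{\sqrt{|G|}}\frac{\partial}{\partial x^\alpha}\left(\sqrt{|G|}\,\xi^\alpha\right)$$

Hence, our proof will be complete if we can show that the second
quantity in parentheses is zero. To do this, we recall from the
rule for differentiating determinants that the derivative of $|J|$ is
equal to the sum of $n$ determinants, the $i$th one of which is
identical with $|J|$ except that the $i$th row consists of the derivatives
of the elements in the $i$th row of $|J|$. Hence, if we denote by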



626 


TENSOR ANALYSIS 


CHAP. 13 


(7) 


(8) 


Ji b the cofactor of the element in the fth row and 5th column of 
\J\, then, in expanded form, 

W_ dW 
da? da? dx b ' 

However, by Corollary 1, Theorem 1, Sec. 13.3, 

Hence 

nence, ^ ^ d%a Qxb e - i 

and, thus, the second quantity in parentheses in (6) is indeed zero ; 
and the scalar 
__1 d_ 

VW\ dx ° 

is invariant and, hence, equal to the divergence in every coordinate 
system. 

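If we use the covariant representation $\xi_a$ instead of the
contravariant representation $\xi^a$, then, since

     $\xi^a = g^{ab}\,\xi_b$

(see Exercise 10, Sec. 13.4), we have for the divergence

(7)  $\frac{1}{\sqrt{|G|}}\frac{\partial}{\partial x^a}\left(\sqrt{|G|}\,\xi^a\right) = \frac{1}{\sqrt{|G|}}\frac{\partial}{\partial x^a}\left(\sqrt{|G|}\,g^{ab}\,\xi_b\right)$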


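If $\xi_a$ is the gradient of a scalar point function $\phi$, that is, if $\xi_a$
is the covariant vector $\partial\phi/\partial x^a$, then its divergence is called the
Laplacian of $\phi$. In other words, in generalized coordinates,

(8)  $\nabla^2\phi = \frac{1}{\sqrt{|G|}}\frac{\partial}{\partial x^a}\left(\sqrt{|G|}\,g^{ab}\,\frac{\partial\phi}{\partial x^b}\right)$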


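Obtain the expression for $\nabla^2\phi$ in cylindrical coordinates.

By direct calculation, as in Example 1, Sec. 13.3, or by observing from a figure that

     $(ds)^2 = (dr)^2 + (r\,d\theta)^2 + (dz)^2$

we find that, in cylindrical coordinates,

     $G = \|g_{ab}\| = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & 1 \end{Vmatrix}$     $|G| = r^2$     $G^{-1} = \|g^{ab}\| = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & 1/r^2 & 0 \\ 0 & 0 & 1 \end{Vmatrix}$

Therefore, from (8),

     $\nabla^2\phi = \frac{1}{r}\left[\frac{\partial}{\partial r}\left(r\,\frac{\partial\phi}{\partial r}\right) + \frac{\partial}{\partial\theta}\left(r\,\frac{1}{r^2}\,\frac{\partial\phi}{\partial\theta}\right) + \frac{\partial}{\partial z}\left(r\,\frac{\partial\phi}{\partial z}\right)\right]$

     $= \frac{1}{r}\left[\left(r\,\frac{\partial^2\phi}{\partial r^2} + \frac{\partial\phi}{\partial r}\right) + \frac{1}{r}\,\frac{\partial^2\phi}{\partial\theta^2} + r\,\frac{\partial^2\phi}{\partial z^2}\right]$

     $= \frac{\partial^2\phi}{\partial r^2} + \frac{1}{r^2}\,\frac{\partial^2\phi}{\partial\theta^2} + \frac{\partial^2\phi}{\partial z^2} + \frac{1}{r}\,\frac{\partial\phi}{\partial r}$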
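The cylindrical form of the Laplacian, namely the second r-derivative plus 1/r times the first r-derivative plus 1/r^2 times the second theta-derivative plus the second z-derivative, can be checked numerically. The following is a minimal Python sketch, not part of the original text: the test function x^2 y + z^3 (whose cartesian Laplacian is 2y + 6z), the sample point, and the step size h are all arbitrary assumptions made for illustration.

```python
import math

def phi_cart(x, y, z):
    # Hypothetical test function; its cartesian Laplacian is 2*y + 6*z.
    return x * x * y + z ** 3

def phi_cyl(r, theta, z):
    # The same function expressed in cylindrical coordinates.
    return phi_cart(r * math.cos(theta), r * math.sin(theta), z)

def laplacian_cyl(f, r, theta, z, h=1e-4):
    """Central-difference estimate of
    d2f/dr2 + (1/r) df/dr + (1/r^2) d2f/dtheta2 + d2f/dz2."""
    f0 = f(r, theta, z)
    d2r = (f(r + h, theta, z) - 2 * f0 + f(r - h, theta, z)) / h ** 2
    dr = (f(r + h, theta, z) - f(r - h, theta, z)) / (2 * h)
    d2t = (f(r, theta + h, z) - 2 * f0 + f(r, theta - h, z)) / h ** 2
    d2z = (f(r, theta, z + h) - 2 * f0 + f(r, theta, z - h)) / h ** 2
    return d2r + dr / r + d2t / r ** 2 + d2z

r, theta, z = 1.3, 0.7, -0.4
numeric = laplacian_cyl(phi_cyl, r, theta, z)
exact = 2 * r * math.sin(theta) + 6 * z  # 2y + 6z with y = r sin(theta)
print(numeric, exact)
```

The two printed values should agree to within the finite-difference error, which is small for a step of this size.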




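Consider, now, an arbitrary covariant vector $\xi_a$. From its
law of transformation

     $\bar\xi_\alpha = \xi_a\,\frac{\partial x^a}{\partial\bar x^\alpha}$

we have, by differentiation,

     $\frac{\partial\bar\xi_\alpha}{\partial\bar x^\beta} = \frac{\partial\xi_a}{\partial x^b}\frac{\partial x^b}{\partial\bar x^\beta}\frac{\partial x^a}{\partial\bar x^\alpha} + \xi_a\,\frac{\partial^2 x^a}{\partial\bar x^\alpha\,\partial\bar x^\beta}$

Similarly, of course,

     $\frac{\partial\bar\xi_\beta}{\partial\bar x^\alpha} = \frac{\partial\xi_b}{\partial x^a}\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial x^b}{\partial\bar x^\beta} + \xi_b\,\frac{\partial^2 x^b}{\partial\bar x^\alpha\,\partial\bar x^\beta}$

Hence, subtracting the last two equations, we have

     $\frac{\partial\bar\xi_\alpha}{\partial\bar x^\beta} - \frac{\partial\bar\xi_\beta}{\partial\bar x^\alpha} = \left(\frac{\partial\xi_a}{\partial x^b} - \frac{\partial\xi_b}{\partial x^a}\right)\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial x^b}{\partial\bar x^\beta}$

where the other terms cancel, since the order of partial differentiation
is immaterial and since a and b are just dummy indices. From
the law of transformation defined by the last equation, it follows
that

     $\frac{\partial\xi_a}{\partial x^b} - \frac{\partial\xi_b}{\partial x^a}$

is a covariant tensor of the second rank. Clearly, it is a generalization
of the familiar notion of the curl of a vector.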

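More specifically, in three dimensions, let $\xi_{ab}$ be an arbitrary
alternating covariant tensor of the second rank, for which, by
definition, $\xi_{ab} = -\xi_{ba}$. From $\xi_{ab}$ we can construct the expression

     $\tfrac{1}{2}\,\xi_{ab}\,\mathbf{e}^a \times \mathbf{e}^b = \xi_{12}\,\mathbf{e}^1 \times \mathbf{e}^2 + \xi_{23}\,\mathbf{e}^2 \times \mathbf{e}^3 + \xi_{31}\,\mathbf{e}^3 \times \mathbf{e}^1$

Moreover, if we use the fact (see Exercise 1, below) that

     $\mathbf{e}^i \times \mathbf{e}^j = \frac{\mathbf{e}_k}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]}$     i, j, k any cyclic permutation of 1, 2, 3

we can write $\tfrac{1}{2}\,\xi_{ab}\,\mathbf{e}^a \times \mathbf{e}^b$ in the form

     $\frac{\xi_{12}}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]}\,\mathbf{e}_3 + \frac{\xi_{23}}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]}\,\mathbf{e}_1 + \frac{\xi_{31}}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]}\,\mathbf{e}_2$

from which it is clear that $\tfrac{1}{2}\,\xi_{ab}\,\mathbf{e}^a \times \mathbf{e}^b$ is a contravariant tensor
of rank 1. Finally, if we recall from Eq. (12), Sec. 13.3, that

     $[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]^2 = |G|$

we can write this tensor in the form

(9)  $\frac{\xi_{23}}{-\sqrt{|G|}}\,\mathbf{e}_1 + \frac{\xi_{31}}{-\sqrt{|G|}}\,\mathbf{e}_2 + \frac{\xi_{12}}{-\sqrt{|G|}}\,\mathbf{e}_3$

If $\xi_{ab} = \frac{\partial\xi_a}{\partial x^b} - \frac{\partial\xi_b}{\partial x^a}$, where $\xi_a$ is a covariant vector, then the
expression (9), with the negative square root used, as indicated, is
precisely the curl of $\xi$ as we defined it in Chap. 12, Sec. 12.3.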





EXERCISES 

1 Using Eqs. (9) and (10), Sec. 13.3, or otherwise, show that

     $\mathbf{e}^2 \times \mathbf{e}^3 = \frac{\mathbf{e}_1}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]}$     $\mathbf{e}^3 \times \mathbf{e}^1 = \frac{\mathbf{e}_2}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]}$     $\mathbf{e}^1 \times \mathbf{e}^2 = \frac{\mathbf{e}_3}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]}$

2 a What is the divergence of a contravariant vector in cylindrical coordinates?
  b What is the divergence of a covariant vector in cylindrical coordinates?

3 a What is the divergence of a contravariant vector in spherical coordinates?
  b What is the divergence of a covariant vector in spherical coordinates?

4 Obtain the expression for $\nabla^2\phi$ in spherical coordinates.

5 If $\xi_a$ is the gradient of a scalar function $\phi$, show that the curl of $\xi_a$ vanishes identically.


13.6 

Covariant differentiation 

Since the components of a tensor are functions of the generalized 
coordinates, it is obvious that they can be differentiated partially 
with respect to the coordinate variables. However, the quantities 
thus obtained are of no intrinsic interest since they are not the 
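components of a tensor. For instance, if $\xi^d$ is a contravariant
vector and if we differentiate the transformation equation

     $\bar\xi^\delta = \xi^d\,\frac{\partial\bar x^\delta}{\partial x^d}$

partially with respect to $\bar x^\beta$, we have

(1)  $\frac{\partial\bar\xi^\delta}{\partial\bar x^\beta} = \frac{\partial\xi^d}{\partial x^b}\frac{\partial x^b}{\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^d} + \xi^d\,\frac{\partial^2\bar x^\delta}{\partial x^b\,\partial x^d}\frac{\partial x^b}{\partial\bar x^\beta}$

Clearly, if the second term on the right were not present, $\partial\xi^d/\partial x^b$
would be a mixed tensor of rank 2, since the first term on the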
right in (1) is precisely what is given by the law of transformation 
for a tensor with one covariant and one contravariant index. It is 
also interesting to note that, if the equations connecting the two 
sets of generalized coordinates are linear, as they are for trans- 
formations between rectangular and oblique coordinates, then 
the second term is missing. These observations raise the important
question of whether or not it is possible to add “correction”
terms $C_b^d$ to the partial derivatives $\partial\xi^d/\partial x^b$ so that

     $\frac{\partial\xi^d}{\partial x^b} + C_b^d$


will be a mixed tensor of rank 2. This is indeed possible, and, 
although we cannot go deeply into the matter, we shall determine 
the appropriate “correction” terms and define the so-called 
covariant derivative. 




From Eq. (1) it is almost obvious that the terms to be added
to $\partial\xi^d/\partial x^b$ to eliminate the second sum should be linear in the $\xi$'s, say,

     $C_b^d = \Gamma_{ab}^{d}\,\xi^a$


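and this is actually the case. To determine the coefficient function
$\Gamma_{ab}^{d}$, we begin with the metric tensor $g_{ab}$. From its law of
transformation, namely,

     $\bar g_{\alpha\beta} = g_{ab}\,\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial x^b}{\partial\bar x^\beta}$

by differentiating each side with respect to $\bar x^\gamma$, we obtain

(2)  $\frac{\partial\bar g_{\alpha\beta}}{\partial\bar x^\gamma} = \frac{\partial g_{ab}}{\partial x^c}\frac{\partial x^c}{\partial\bar x^\gamma}\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial x^b}{\partial\bar x^\beta} + \left[g_{ab}\,\frac{\partial^2 x^a}{\partial\bar x^\gamma\,\partial\bar x^\alpha}\frac{\partial x^b}{\partial\bar x^\beta}\right] + \left\{g_{ab}\,\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial^2 x^b}{\partial\bar x^\gamma\,\partial\bar x^\beta}\right\}$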

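From this, by first interchanging $\beta$ and $\gamma$ and then interchanging
$\gamma$ and $\alpha$, and making the corresponding permutations of the
dummy indices a, b, c in the first term, we obtain, respectively,

(3)  $\frac{\partial\bar g_{\alpha\gamma}}{\partial\bar x^\beta} = \frac{\partial g_{ac}}{\partial x^b}\frac{\partial x^b}{\partial\bar x^\beta}\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial x^c}{\partial\bar x^\gamma} + g_{ab}\,\frac{\partial^2 x^a}{\partial\bar x^\beta\,\partial\bar x^\alpha}\frac{\partial x^b}{\partial\bar x^\gamma} + \left\{g_{ab}\,\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial^2 x^b}{\partial\bar x^\beta\,\partial\bar x^\gamma}\right\}$

(4)  $\frac{\partial\bar g_{\gamma\beta}}{\partial\bar x^\alpha} = \frac{\partial g_{cb}}{\partial x^a}\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial x^c}{\partial\bar x^\gamma}\frac{\partial x^b}{\partial\bar x^\beta} + \left[g_{ab}\,\frac{\partial^2 x^a}{\partial\bar x^\alpha\,\partial\bar x^\gamma}\frac{\partial x^b}{\partial\bar x^\beta}\right] + g_{ab}\,\frac{\partial x^a}{\partial\bar x^\gamma}\frac{\partial^2 x^b}{\partial\bar x^\alpha\,\partial\bar x^\beta}$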

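Now, subtracting (2) from the sum of (3) and (4), noting that the
quantities in brackets and the quantities in braces cancel respectively,
we obtain

     $\frac{\partial\bar g_{\gamma\beta}}{\partial\bar x^\alpha} + \frac{\partial\bar g_{\alpha\gamma}}{\partial\bar x^\beta} - \frac{\partial\bar g_{\alpha\beta}}{\partial\bar x^\gamma} = \left(\frac{\partial g_{cb}}{\partial x^a} + \frac{\partial g_{ac}}{\partial x^b} - \frac{\partial g_{ab}}{\partial x^c}\right)\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial x^b}{\partial\bar x^\beta}\frac{\partial x^c}{\partial\bar x^\gamma} + g_{ab}\,\frac{\partial^2 x^a}{\partial\bar x^\alpha\,\partial\bar x^\beta}\frac{\partial x^b}{\partial\bar x^\gamma} + g_{ab}\,\frac{\partial x^a}{\partial\bar x^\gamma}\frac{\partial^2 x^b}{\partial\bar x^\alpha\,\partial\bar x^\beta}$

Finally, interchanging the dummy indices a and b in the last term
and recalling that $g_{ba} = g_{ab}$, we have

(5)  $\frac{\partial\bar g_{\gamma\beta}}{\partial\bar x^\alpha} + \frac{\partial\bar g_{\alpha\gamma}}{\partial\bar x^\beta} - \frac{\partial\bar g_{\alpha\beta}}{\partial\bar x^\gamma} = \left(\frac{\partial g_{cb}}{\partial x^a} + \frac{\partial g_{ac}}{\partial x^b} - \frac{\partial g_{ab}}{\partial x^c}\right)\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial x^b}{\partial\bar x^\beta}\frac{\partial x^c}{\partial\bar x^\gamma} + 2g_{ab}\,\frac{\partial^2 x^a}{\partial\bar x^\alpha\,\partial\bar x^\beta}\frac{\partial x^b}{\partial\bar x^\gamma}$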

The quantities

(6)  $\Gamma_{c,ab} = \frac{1}{2}\left(\frac{\partial g_{cb}}{\partial x^a} + \frac{\partial g_{ac}}{\partial x^b} - \frac{\partial g_{ab}}{\partial x^c}\right)$

whose law of transformation is given by Eq. (5), are known as
Christoffel symbols of the first kind.* Incidentally, because of
the second term on the right in the transformation equation (5),
it is clear that $\Gamma_{c,ab}$ is not a tensor.

The Christoffel symbols of the second kind are, by definition,
the quantities

(7)  $\Gamma_{ab}^{d} = g^{dc}\,\Gamma_{c,ab}$
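To make the two definitions concrete, the symbols of the first kind, one half of (dg_cb/dx^a + dg_ac/dx^b - dg_ab/dx^c), and those of the second kind, obtained by contracting with g^dc, can be evaluated numerically for a particular metric. The following is a minimal Python sketch, not part of the original text. It assumes the cylindrical metric diag(1, r^2, 1) with coordinates ordered (r, theta, z) and indexed 0, 1, 2; the function names, the sample point, and the finite-difference step are all hypothetical choices.

```python
def metric(x):
    """Cylindrical metric g_ab = diag(1, r^2, 1) at x = (r, theta, z)."""
    r = x[0]
    return [[1.0, 0.0, 0.0], [0.0, r * r, 0.0], [0.0, 0.0, 1.0]]

def inverse_metric(x):
    """Inverse metric g^ab = diag(1, 1/r^2, 1)."""
    r = x[0]
    return [[1.0, 0.0, 0.0], [0.0, 1.0 / (r * r), 0.0], [0.0, 0.0, 1.0]]

def dg(x, a, b, c, h=1e-6):
    """Central-difference estimate of d g_ab / d x^c."""
    xp, xm = list(x), list(x)
    xp[c] += h
    xm[c] -= h
    return (metric(xp)[a][b] - metric(xm)[a][b]) / (2 * h)

def christoffel_first(x, c, a, b):
    """Gamma_{c,ab} = (1/2)(d g_cb/dx^a + d g_ac/dx^b - d g_ab/dx^c)."""
    return 0.5 * (dg(x, c, b, a) + dg(x, a, c, b) - dg(x, a, b, c))

def christoffel_second(x, d, a, b):
    """Gamma_ab^d = g^{dc} Gamma_{c,ab}, summed over c."""
    ginv = inverse_metric(x)
    return sum(ginv[d][c] * christoffel_first(x, c, a, b) for c in range(3))

point = (2.0, 0.5, 1.0)  # r = 2, theta = 0.5, z = 1
print(christoffel_first(point, 0, 1, 1))   # Gamma_{r,theta theta}, analytically -r
print(christoffel_second(point, 1, 0, 1))  # Gamma_{r theta}^theta, analytically 1/r
```

For this metric the only nonvanishing symbols are those involving one r index and two theta indices, which is what the sketch reports at the sample point.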


* Named for the German mathematician E. B. Christoffel (1829-1900).




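To obtain their law of transformation, we begin by recalling from
Eq. (26), Sec. 13.3, that

     $\bar g^{\delta\gamma} = g^{di}\,\frac{\partial\bar x^\delta}{\partial x^d}\frac{\partial\bar x^\gamma}{\partial x^i}$

Hence,

     $\bar\Gamma_{\alpha\beta}^{\delta} = \bar g^{\delta\gamma}\,\frac{1}{2}\left(\frac{\partial\bar g_{\gamma\beta}}{\partial\bar x^\alpha} + \frac{\partial\bar g_{\alpha\gamma}}{\partial\bar x^\beta} - \frac{\partial\bar g_{\alpha\beta}}{\partial\bar x^\gamma}\right)$

     $= g^{di}\,\frac{\partial\bar x^\delta}{\partial x^d}\frac{\partial\bar x^\gamma}{\partial x^i}\left[\frac{1}{2}\left(\frac{\partial g_{cb}}{\partial x^a} + \frac{\partial g_{ac}}{\partial x^b} - \frac{\partial g_{ab}}{\partial x^c}\right)\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial x^b}{\partial\bar x^\beta}\frac{\partial x^c}{\partial\bar x^\gamma} + g_{ab}\,\frac{\partial^2 x^a}{\partial\bar x^\alpha\,\partial\bar x^\beta}\frac{\partial x^b}{\partial\bar x^\gamma}\right]$

     $= \frac{1}{2}\,g^{di}\left(\frac{\partial g_{cb}}{\partial x^a} + \frac{\partial g_{ac}}{\partial x^b} - \frac{\partial g_{ab}}{\partial x^c}\right)\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial x^b}{\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^d}\left[\frac{\partial x^c}{\partial\bar x^\gamma}\frac{\partial\bar x^\gamma}{\partial x^i}\right] + g^{di}\,g_{ab}\,\frac{\partial^2 x^a}{\partial\bar x^\alpha\,\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^d}\left[\frac{\partial x^b}{\partial\bar x^\gamma}\frac{\partial\bar x^\gamma}{\partial x^i}\right]$

Now, by Lemma 1, Sec. 13.3, the bracketed terms become

     $\frac{\partial x^c}{\partial\bar x^\gamma}\frac{\partial\bar x^\gamma}{\partial x^i} = \delta_i^c$     and     $\frac{\partial x^b}{\partial\bar x^\gamma}\frac{\partial\bar x^\gamma}{\partial x^i} = \delta_i^b$

Therefore, the last equation simplifies to

     $\bar\Gamma_{\alpha\beta}^{\delta} = \frac{1}{2}\,g^{dc}\left(\frac{\partial g_{cb}}{\partial x^a} + \frac{\partial g_{ac}}{\partial x^b} - \frac{\partial g_{ab}}{\partial x^c}\right)\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial x^b}{\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^d} + g^{db}\,g_{ab}\,\frac{\partial^2 x^a}{\partial\bar x^\alpha\,\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^d}$

Furthermore, since $\|g^{ij}\|$ and $\|g_{ij}\|$ are inverse matrices, it follows
that

     $g^{db}\,g_{ab} = g^{db}\,g_{ba} = \delta_a^d$

Hence, the last term in the preceding equation reduces to

     $\frac{\partial^2 x^a}{\partial\bar x^\alpha\,\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^a}$

and we have, finally, the law of transformation

(8)  $\bar\Gamma_{\alpha\beta}^{\delta} = \frac{1}{2}\,g^{dc}\left(\frac{\partial g_{cb}}{\partial x^a} + \frac{\partial g_{ac}}{\partial x^b} - \frac{\partial g_{ab}}{\partial x^c}\right)\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial x^b}{\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^d} + \frac{\partial^2 x^a}{\partial\bar x^\alpha\,\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^a}$

     $= g^{dc}\,\Gamma_{c,ab}\,\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial x^b}{\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^d} + \frac{\partial^2 x^a}{\partial\bar x^\alpha\,\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^a}$

     $= \Gamma_{ab}^{d}\,\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial x^b}{\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^d} + \frac{\partial^2 x^a}{\partial\bar x^\alpha\,\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^a}$

Because of the second term on the right in (8), it is clear that $\Gamma_{ab}^{d}$,
like $\Gamma_{c,ab}$, is not a tensor.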

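We can now establish the fundamental result that $\frac{\partial\xi^d}{\partial x^b} + \Gamma_{ab}^{d}\,\xi^a$
is a mixed tensor of rank 2. In fact, knowing the law of transformation
for $\partial\xi^d/\partial x^b$, namely, Eq. (1), and the law of transformation for
$\Gamma_{ab}^{d}$, namely, Eq. (8), we have

     $\frac{\partial\bar\xi^\delta}{\partial\bar x^\beta} + \bar\Gamma_{\alpha\beta}^{\delta}\,\bar\xi^\alpha = \frac{\partial\xi^d}{\partial x^b}\frac{\partial x^b}{\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^d} + \xi^d\,\frac{\partial^2\bar x^\delta}{\partial x^b\,\partial x^d}\frac{\partial x^b}{\partial\bar x^\beta} + \left(\Gamma_{ab}^{d}\,\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial x^b}{\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^d} + \frac{\partial^2 x^a}{\partial\bar x^\alpha\,\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^a}\right)\xi^i\,\frac{\partial\bar x^\alpha}{\partial x^i}$

or, replacing the dummy index d by i in the second term, replacing
the dummy index a by b in the fourth term, and observing that in
the third term $\frac{\partial x^a}{\partial\bar x^\alpha}\frac{\partial\bar x^\alpha}{\partial x^i} = \delta_i^a$,

     $\frac{\partial\bar\xi^\delta}{\partial\bar x^\beta} + \bar\Gamma_{\alpha\beta}^{\delta}\,\bar\xi^\alpha = \frac{\partial\xi^d}{\partial x^b}\frac{\partial x^b}{\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^d} + \xi^i\,\frac{\partial^2\bar x^\delta}{\partial x^b\,\partial x^i}\frac{\partial x^b}{\partial\bar x^\beta} + \Gamma_{ab}^{d}\,\xi^a\,\frac{\partial x^b}{\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^d} + \xi^i\,\frac{\partial^2 x^b}{\partial\bar x^\alpha\,\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^b}\frac{\partial\bar x^\alpha}{\partial x^i}$

(9)  $= \left(\frac{\partial\xi^d}{\partial x^b} + \Gamma_{ab}^{d}\,\xi^a\right)\frac{\partial x^b}{\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^d} + \xi^i\left[\frac{\partial^2\bar x^\delta}{\partial x^b\,\partial x^i}\frac{\partial x^b}{\partial\bar x^\beta} + \frac{\partial^2 x^b}{\partial\bar x^\alpha\,\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^b}\frac{\partial\bar x^\alpha}{\partial x^i}\right]$

Now, $\frac{\partial\bar x^\delta}{\partial x^b}\frac{\partial x^b}{\partial\bar x^\beta} = \delta_\beta^\delta$. Hence, differentiating with respect to $x^i$, we
have

     $\frac{\partial^2\bar x^\delta}{\partial x^i\,\partial x^b}\frac{\partial x^b}{\partial\bar x^\beta} + \frac{\partial\bar x^\delta}{\partial x^b}\frac{\partial^2 x^b}{\partial\bar x^\alpha\,\partial\bar x^\beta}\frac{\partial\bar x^\alpha}{\partial x^i} = 0$

Therefore, the expression in brackets in Eq. (9) is equal to zero,
and we have

     $\frac{\partial\bar\xi^\delta}{\partial\bar x^\beta} + \bar\Gamma_{\alpha\beta}^{\delta}\,\bar\xi^\alpha = \left(\frac{\partial\xi^d}{\partial x^b} + \Gamma_{ab}^{d}\,\xi^a\right)\frac{\partial x^b}{\partial\bar x^\beta}\frac{\partial\bar x^\delta}{\partial x^d}$

which proves that

(10)  $\frac{\partial\xi^d}{\partial x^b} + \Gamma_{ab}^{d}\,\xi^a$

is a mixed tensor of the second rank, as asserted.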

The expression (10) is called the covariant derivative of the
contravariant vector $\xi^d$, and is frequently denoted by the symbol

     $\frac{D\xi^d}{\partial x^b}$

In very much the same way it can be shown that, if $\xi_d$ is a
covariant vector, then

(11)  $\frac{\partial\xi_d}{\partial x^b} - \Gamma_{db}^{a}\,\xi_a$

is a covariant tensor of rank 2. This expression is known as the
covariant derivative of the covariant vector $\xi_d$, and is denoted by
the symbol

     $\frac{D\xi_d}{\partial x^b}$



It can also be shown that any tensor has a covariant derivative,
in which a term like the second term in (10) enters for each
contravariant index in the tensor and a term like the second term
in (11) for each covariant index. Thus, for tensors of the second
rank, we have the formulas

     $\frac{D\xi^{de}}{\partial x^b} = \frac{\partial\xi^{de}}{\partial x^b} + \Gamma_{ab}^{d}\,\xi^{ae} + \Gamma_{ab}^{e}\,\xi^{da}$


1 a Show that $\Gamma_{c,ab} = \Gamma_{c,ba}$.   b Show that $\Gamma_{ab}^{d} = \Gamma_{ba}^{d}$.

  c Show that $\Gamma_{a,bc} + \Gamma_{b,ac} = \frac{\partial g_{ab}}{\partial x^c}$.

  d Show that a necessary and sufficient condition that the Christoffel symbols all be zero is
    that the $g_{ab}$'s be constants.

2 a Calculate the Christoffel symbols for a cylindrical coordinate system, 
b Calculate the Christoffel symbols for a spherical coordinate system. 

3 If $\phi$ is a scalar function and $\xi^d$ is a contravariant vector, show that

dx> dx b v dx h 

4 Prove that ~~ - 0. 

dx k 

. „ ...W?) Dr , M Du 


CHAPTER FOURTEEN 


Analytic Functions 
of a 

Complex Variable


14.1 

Introduction

In our work up to this point we have frequently found the use of
complex numbers necessary or at least convenient. For instance, 
we encountered them in the solution of linear differential equa- 
tions with constant coefficients in Chap. 2. In Chap. 5 they 
appeared in the complex impedance, which we found of consider- 
able utility in the determination of the steady-state behavior of 
electrical circuits. Then, in Chap. 6, their use led to the important 
complex exponential form of Fourier series and ultimately to the 
inversion integral of Laplace transform theory. Finally, in Chap. 
9, we found that certain important physical problems required 
the consideration of Bessel functions of complex arguments. 

None of these applications, with the exception of the inversion
integral, for which fortunately we had no immediate need,
required any knowledge of the properties of complex numbers or 
of functions of a complex variable beyond what is ordinarily 
acquired in courses in college algebra and calculus. There are, 
however, large areas of applied mathematics in which familiarity 
with the theory of functions of a complex variable beyond this 
minimum is indispensable. In this and the next three chapters 
we shall develop the major features of this theory and illustrate 
some of its more striking applications. 


14.2 

Algebraic preliminaries 

By a complex number we mean a number of the form 
z = x + iy 

where x and y are real numbers and i is the so-called imaginary 





unit whose existence is postulated such that $i^2 = -1$. The real
number x is called the real component or real part of z. The real
number y is called the imaginary component or imaginary part
of z. The real and imaginary parts of a complex number or
expression z are often denoted by the respective symbols

     $\mathcal{R}(z)$   and   $\mathcal{I}(z)$

It is important to keep in mind that $\mathcal{I}(z)$, as here defined, is a
real quantity.

Two complex numbers a + ib and c + id are said to be 
equal if and only if the real and imaginary parts of the first are, 
respectively, equal to the real and imaginary parts of the second. 
In particular, the vanishing of a complex number implies not 
one but two conditions, namely, that both the real part and the 
imaginary part of the given number are zero. 

EXAMPLE 1

If        $(x + y + 2) + (x^2 + y)i = 0$

then      $x + y + 2 = 0$   and also   $x^2 + y = 0$

From this pair of simultaneous equations it follows necessarily that

          x = 2 and y = -4     or     x = -1 and y = -1
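The two solutions of Example 1, x = 2, y = -4 and x = -1, y = -1, can be confirmed by direct substitution into (x + y + 2) + (x^2 + y)i. The following is a minimal Python sketch, not part of the original text; the helper name `lhs` is a hypothetical choice, and Python writes the imaginary unit as `j`.

```python
def lhs(x, y):
    """Left-hand side of Example 1: (x + y + 2) + (x^2 + y) i."""
    return complex(x + y + 2, x * x + y)

solutions = [(2, -4), (-1, -1)]
values = [lhs(x, y) for x, y in solutions]
print(values)

# A pair that satisfies only the first condition is not a solution:
partial = lhs(0, -2)  # real part vanishes, imaginary part is -2
print(partial)
```

The second call illustrates the point made in the text: a complex number vanishes only when both its real part and its imaginary part are zero.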

If z = x + iy, then the negative of z is the complex number

     $-z = -x - iy$

If two complex numbers differ only in the sign of their imaginary
parts, either one is said to be the conjugate of the other. The
conjugate of a complex number z is usually written $\bar z$ or, less
frequently, $z^*$.

Addition, subtraction, and multiplication of complex numbers 
follow the familiar rules for real quantities, with the additional 
provision that in multiplication all powers of i are to be reduced 
as far as possible by applying the definitive property of i and 
its obvious extensions: 

     $i^2 = -1$
     $i^3 = i^2 \cdot i = -i$
     $i^4 = i^2 \cdot i^2 = 1$
     $i^5 = i^4 \cdot i = i$

Thus  $(a + ib) + (c + id) = (a + c) + (b + d)i$
and   $(a + ib)(c + id) = (ac - bd) + (bc + ad)i$

Division of complex numbers is defined as the inverse of 
multiplication; that is, (a + ib)/(c + id) is the complex number 
z = x + iy which satisfies the equation (c + id)(x + iy) = a + ib.
Performing the indicated multiplication, we find 
(cx — dy) + (dx + cy)i = a + ib 



which implies that

     cx - dy = a     and     dx + cy = b

Solving these for x and y, we obtain

     $x = \frac{ac + bd}{c^2 + d^2}$     and     $y = \frac{bc - ad}{c^2 + d^2}$

Hence,

(1)  $\frac{a + ib}{c + id} = \frac{ac + bd}{c^2 + d^2} + \frac{bc - ad}{c^2 + d^2}\,i$

In practice, the quotient of two complex numbers is usually
found by multiplying both numerator and denominator by the
conjugate of the denominator:

     $\frac{a + ib}{c + id} = \frac{a + ib}{c + id} \cdot \frac{c - id}{c - id} = \frac{ac + bd}{c^2 + d^2} + \frac{bc - ad}{c^2 + d^2}\,i$     $c + id \ne 0$
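The rule for division, with real part (ac + bd)/(c^2 + d^2) and imaginary part (bc - ad)/(c^2 + d^2), translates directly into a short routine. The following is a minimal Python sketch, not part of the original text; the function name `divide` and the sample operands 3 + 4i and 1 - 2i are hypothetical, and the comparison with Python's built-in complex division is included only as a cross-check.

```python
def divide(a, b, c, d):
    """Return (x, y) with (a + ib)/(c + id) = x + iy.

    Numerator and denominator are both multiplied by the conjugate c - id,
    which makes the denominator the real number c^2 + d^2.
    """
    denom = c * c + d * d
    if denom == 0:
        raise ZeroDivisionError("c + id must be different from zero")
    return ((a * c + b * d) / denom, (b * c - a * d) / denom)

x, y = divide(3, 4, 1, -2)               # (3 + 4i)/(1 - 2i)
builtin = complex(3, 4) / complex(1, -2)  # cross-check with built-in arithmetic
print(x, y, builtin)
```

Here the quotient is -1 + 2i, as can be verified by multiplying it by 1 - 2i.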


Conjugate complex numbers have various simple though
important properties. For instance, if z = x + iy, then

     $z\bar z = (x + iy)(x - iy) = x^2 + y^2$

which is a purely real quantity. This is the basis for the use of
conjugates in division. Also,

     $z + \bar z = (x + iy) + (x - iy) = 2x = 2\mathcal{R}(z)$
or
(2)  $\mathcal{R}(z) = \frac{z + \bar z}{2}$

and  $z - \bar z = (x + iy) - (x - iy) = 2iy = 2i\,\mathcal{I}(z)$
or
(3)  $\mathcal{I}(z) = \frac{z - \bar z}{2i}$

In taking the conjugate of a complicated expression, the 
following results are of great utility: 

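(4)  $\overline{z_1 \pm z_2} = \bar z_1 \pm \bar z_2$

(5)  $\overline{z_1 z_2} = \bar z_1\,\bar z_2$

(6)  $\overline{\left(\frac{z_1}{z_2}\right)} = \frac{\bar z_1}{\bar z_2}$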

The proofs of these all follow immediately from the four laws 
of operation and the definition of conjugates. 


EXERCISES 

1 Prove that, if a number is equal to its conjugate, it is necessarily real. 

2 Prove that any number is equal to the conjugate of its conjugate. 

3 Prove that, if the product of two complex numbers is zero, at least one of the numbers must
be zero.

Reduce each of the following expressions to the form a + ib:


4 (1 - i)* + (2 + iy 
6 i( 2 + 3 i)* 

8 I ±1 

(3 — i) (1 — i) 


5 (1 - 2i)(3 + 2i) 3 

7 l+i 1 ~ i 

1 — i l+i 

9 --■■<!+£ 

(2 + t)(l + 2 i) 


10 Verify that $z = (1 \pm i\sqrt{3})/2$ satisfies the equation $z^2 - z + 1 = 0$.

11 Show that, for all combinations of signs, $z = (\pm 1 \pm i)/\sqrt{2}$ satisfies the equation
$z^4 + 1 = 0$.

12 What is ■«(** - 2z)? a(z s - 2a)? 

13 If F(z) is a polynomial in z with real coefficients and F(2 + 3i) = 1 - i, what is F(2 - 3i)?
Is F(a - ib) determined by a knowledge of F(a + ib) if the coefficients of F(z) are not all
real?

14 If $B\bar B > (A + \bar A)(C + \bar C)$, show that the equation

     $(A + \bar A)z\bar z + Bz + \bar B\bar z + (C + \bar C) = 0$

represents a real circle, and find its center and radius.

15 Solve the equation $(x^2 y - 2) + (x + 2xy - 5)i = 0$ for x and y.


14.3 

The geometric representation of complex numbers 

A complex number is represented geometrically either by the 
point P whose abscissa and ordinate are, respectively, the real 
and imaginary components of the given number or by the 
directed line segment, or vector, which joins the origin to this 
point. When used in this fashion for representing complex 
numbers, the cartesian plane is referred to as the Argand diagram*
or the complex plane or simply as the z-plane.

The vector OP which represents the complex number x + iy
possesses two important attributes besides its components x
and y. These are its length

(1)  $r = \sqrt{x^2 + y^2}$

and its direction angle

(2)  $\theta = \tan^{-1}\frac{y}{x}$†

Since (Fig. 14.1) $x = r\cos\theta$ and $y = r\sin\theta$, it is evident that
x + iy can be written in the equivalent form

(3)  $z = r\cos\theta + ir\sin\theta = r(\cos\theta + i\sin\theta)$


* Named for the French mathematician J. R. Argand (1768-1822), although 
the Norwegian Caspar Wessel (1745-1818) published a discussion of this 
method of representation nine years before Argand did. 

† Actually, $\tan^{-1}(y/x)$ defines two sets of angles in opposite quadrants,
the angles of one set equaling the angle of z, the others not. Hence, in using
the formula $\theta = \tan^{-1}(y/x)$ one must be careful to select the angles in the
proper quadrant, as determined by the signs of x and y.




FIGURE 14.1
The modulus r, the amplitude $\theta$, and the components x and y of the
complex number z = x + iy.

This is known as the polar or trigonometric form of a complex
number and is sometimes abbreviated to

     $r \operatorname{cis} \theta$

in which only the initial letters of cosine and sine are retained.
The length r is called the absolute value or modulus of z (written
mod z). The angle $\theta$ is called the amplitude or argument of z
(written arg z).
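The caution about choosing the proper quadrant when the amplitude is found from the inverse tangent of y/x is easy to illustrate numerically. The following is a minimal Python sketch, not part of the original text. It assumes the amplitude is reported in the range from -pi to pi, which is the convention of the standard-library function `math.atan2`; the function name `to_polar` and the sample number -1 - i are hypothetical.

```python
import math

def to_polar(x, y):
    """Return (r, theta) for z = x + iy, with theta in (-pi, pi]."""
    r = math.hypot(x, y)      # sqrt(x^2 + y^2)
    theta = math.atan2(y, x)  # uses the signs of x and y to pick the quadrant
    return r, theta

r, theta = to_polar(-1.0, -1.0)  # z = -1 - i lies in the third quadrant
naive = math.atan(-1.0 / -1.0)   # pi/4: the angle of 1 + i, not of -1 - i

print(r, theta, naive)
print(r * math.cos(theta), r * math.sin(theta))  # recovers x and y
```

The naive inverse tangent returns an angle in the first quadrant, whereas the amplitude of -1 - i is -3 pi/4 (equivalently 5 pi/4).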

The various combinations of complex numbers we have thus 
far discussed can easily be interpreted geometrically. For instance, 
Fig. 14.2 shows that the negative of a complex number is the 
reflection of that number in the origin, while the conjugate of a
complex number is the reflection of that number in the real axis. 
The geometrical addition of complex numbers is shown in Fig. 
14.3a. By drawing one complex number from the terminus of 





the other and completing the triangle thus formed, a third complex
number is determined whose components are precisely those
of the sum $z_1 + z_2$. Figure 14.3b shows the construction for the
difference of two complex numbers, i.e., for the sum $z_1 + (-z_2)$.
Evidently, $z_1 - z_2$ is identical in length and direction with the
vector drawn from the end of $z_2$ to the end of $z_1$.

Both the sum and the difference of two complex numbers 
can be described in terms of the parallelogram having the given 
numbers for adjacent sides; for the sum is simply the diagonal 
of the parallelogram which passes through the common origin 
of the two vectors, and the difference is just the other diagonal, 
properly directed. Much of the utility of complex numbers in 
elementary engineering applications stems from the fact that 
they add according to the parallelogram law. Since this is the 
experimentally established law for the addition of such things as 
forces, velocities, currents, and voltages, it is evident that in 
two dimensions complex numbers, like ordinary vectors in three 
dimensions, can conveniently be used to represent such quantities. 

Although we shall have no occasion to use it, a graphical 
process for multiplying and dividing complex numbers can also 
be devised. It is based upon the following exceedingly important 
considerations. If we have two complex numbers given in polar 
form, their product can be written

     $z_1 z_2 = [r_1(\cos\theta_1 + i\sin\theta_1)][r_2(\cos\theta_2 + i\sin\theta_2)]$

     $= r_1 r_2[(\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2) + i(\sin\theta_1\cos\theta_2 + \cos\theta_1\sin\theta_2)]$

(4)  $= r_1 r_2[\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)]$

and their quotient can be written

     $\frac{z_1}{z_2} = \frac{r_1(\cos\theta_1 + i\sin\theta_1)}{r_2(\cos\theta_2 + i\sin\theta_2)}$

     $= \frac{r_1(\cos\theta_1 + i\sin\theta_1)(\cos\theta_2 - i\sin\theta_2)}{r_2(\cos\theta_2 + i\sin\theta_2)(\cos\theta_2 - i\sin\theta_2)}$

     $= \frac{r_1}{r_2}\,\frac{(\cos\theta_1\cos\theta_2 + \sin\theta_1\sin\theta_2) + i(\sin\theta_1\cos\theta_2 - \cos\theta_1\sin\theta_2)}{\cos^2\theta_2 + \sin^2\theta_2}$

(5)  $= \frac{r_1}{r_2}[\cos(\theta_1 - \theta_2) + i\sin(\theta_1 - \theta_2)]$     $r_2 \ne 0$

In words, then, the product of two complex numbers is a complex 
number whose absolute value is the product of the absolute values 
of the two factors and whose amplitude is the sum of the amplitudes 
of the two factors, and the quotient of two complex numbers is a 
complex number whose absolute value is the quotient of the absolute
values of the numbers and whose amplitude is the difference of their 
amplitudes. The behavior of the angles of complex numbers when 
the numbers are multiplied or divided is concisely expressed by 




the formulas 

(6)  $\arg z_1 z_2 = \arg z_1 + \arg z_2$

(7)  $\arg \frac{z_1}{z_2} = \arg z_1 - \arg z_2$

In Sec. 14.7, when we succeed in writing a general complex num- 
ber as an exponential, the reason for the striking resemblance 
of these results to the corresponding logarithmic formulas will 
become apparent. 
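The rules that absolute values multiply and amplitudes add under multiplication, and that absolute values divide and amplitudes subtract under division, can be checked with the standard-library `cmath` module. The following is a minimal Python sketch, not part of the original text; the moduli 2 and 1.5 and the angles of 50 and 100 degrees are arbitrary choices. Since `cmath.polar` reports angles between -pi and pi, sums or differences outside that range would agree with the formulas only up to a multiple of 2 pi.

```python
import cmath
import math

z1 = cmath.rect(2.0, math.radians(50))   # modulus 2, amplitude 50 degrees
z2 = cmath.rect(1.5, math.radians(100))  # modulus 1.5, amplitude 100 degrees

r_prod, arg_prod = cmath.polar(z1 * z2)
r_quot, arg_quot = cmath.polar(z1 / z2)

print(r_prod, math.degrees(arg_prod))  # modulus 2 * 1.5, amplitude 50 + 100 degrees
print(r_quot, math.degrees(arg_quot))  # modulus 2 / 1.5, amplitude 50 - 100 degrees
```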

The extension of these ideas to products of more than two 
factors is obvious, and we can write at once 
     $z_1 z_2 \cdots z_n = r_1 r_2 \cdots r_n[\cos(\theta_1 + \theta_2 + \cdots + \theta_n) + i\sin(\theta_1 + \theta_2 + \cdots + \theta_n)]$

In particular, if all the z's are the same, we have the important
result

(8)  $z^n = r^n(\cos n\theta + i\sin n\theta)$

If r = 1, this is known as de Moivre's theorem.* Since the law of
division in polar form (5) gives

     $\frac{1}{z} = \frac{1}{r}[\cos(0 - \theta) + i\sin(0 - \theta)] = \frac{1}{r}[\cos(-\theta) + i\sin(-\theta)]$

which is just the content of Eq. (8) for n = -1, it is clear that
this formula is valid for all integral values of n, both positive 
and negative. 

The extension of Eq. (8) to roots of integral order is an easy 
matter. In fact, an nth root of $z = r(\cos\theta + i\sin\theta)$ is defined
as any number $w = R(\cos\phi + i\sin\phi)$ such that

     $w^n = R^n(\cos n\phi + i\sin n\phi) = z = r(\cos\theta + i\sin\theta)$

Since two complex numbers which are equal must have the same
modulus, it follows that

     $R^n = r$     or     $R = r^{1/n}$

It should be noted that only real numbers are involved in the
determination of R, since $r^{1/n}$ is the real nth root of the positive
quantity r and can be found by an ordinary logarithmic calculation.
Also, the angles of equal complex numbers must either be
equal or differ at most by an integral multiple of $2\pi$. Hence,

     $n\phi = \theta + 2k\pi$     or     $\phi = \frac{\theta + 2k\pi}{n}$

For k = 0, 1, . . . , n - 1, these values of $\phi$ define distinct
angles; after this the same angles are repeated, again and again,
each time with an irrelevant increment of $2\pi$ in their measures.


* Named for the French mathematician Abraham de Moivre (1667-1754), 
although an equivalent form had been obtained earlier by the Englishman 
Roger Cotes (1682-1716).




Thus there are exactly n distinct values of $w = z^{1/n}$:

(9)  $w = z^{1/n} = r^{1/n}\left(\cos\frac{\theta + 2k\pi}{n} + i\sin\frac{\theta + 2k\pi}{n}\right)$     k = 0, 1, . . . , n - 1

In the complex plane these are represented by radii of the circle
with center at the origin and radius $r^{1/n}$, spaced at equal angular
intervals of $2\pi/n$ from the radius whose angle is $\theta/n$.

With integral powers and roots defined, the general rational
power of a complex number can be defined at once. In fact,

     $z^{p/q} = (z^{1/q})^p = \left[r^{1/q}\left(\cos\frac{\theta + 2k\pi}{q} + i\sin\frac{\theta + 2k\pi}{q}\right)\right]^p$

(10)  $= r^{p/q}\left[\cos\frac{p}{q}(\theta + 2k\pi) + i\sin\frac{p}{q}(\theta + 2k\pi)\right]$     k = 0, 1, . . . , q - 1

The definition of $z^\alpha$ when $\alpha$ is not a rational number, however,
must be postponed until Sec. 14.7.

EXAMPLE 1 

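Find the four fourth roots of -8i.

To do this, we must first write -8i in standard polar form:

     $-8i = 8\left(\cos\frac{3\pi}{2} + i\sin\frac{3\pi}{2}\right)$

From this, by applying Eq. (9), we find that the four fourth roots of -8i are given by the
expression

     $8^{1/4}\left[\cos\frac{1}{4}\left(\frac{3\pi}{2} + 2k\pi\right) + i\sin\frac{1}{4}\left(\frac{3\pi}{2} + 2k\pi\right)\right]$     k = 0, 1, 2, 3

or, explicitly,

     $r_1 = 8^{1/4}\left(\cos\frac{3\pi}{8} + i\sin\frac{3\pi}{8}\right)$     k = 0

     $r_2 = 8^{1/4}\left(\cos\frac{7\pi}{8} + i\sin\frac{7\pi}{8}\right)$     k = 1

     $r_3 = 8^{1/4}\left(\cos\frac{11\pi}{8} + i\sin\frac{11\pi}{8}\right)$     k = 2

     $r_4 = 8^{1/4}\left(\cos\frac{15\pi}{8} + i\sin\frac{15\pi}{8}\right)$     k = 3

The coefficient $8^{1/4}$ is, of course, the real fourth root of 8, the value of which is found by a simple
logarithmic calculation to be 1.682.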
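The root formula, in which the modulus is the real nth root of r and the angles are (theta + 2k pi)/n for k = 0, 1, . . . , n - 1, can be turned into a short routine and applied to -8i. The following is a minimal Python sketch, not part of the original text; the function name `nth_roots` is hypothetical. Note that `cmath.polar` reports the amplitude of -8i as -pi/2 rather than 3 pi/2, so the same four roots are produced in a different order from that of the worked example.

```python
import cmath
import math

def nth_roots(z, n):
    """All n distinct nth roots of a nonzero complex number z."""
    r, theta = cmath.polar(z)
    root_r = r ** (1.0 / n)  # the real, positive nth root of the modulus
    return [cmath.rect(root_r, (theta + 2 * math.pi * k) / n)
            for k in range(n)]

roots = nth_roots(-8j, 4)
for w in roots:
    # Each root has modulus 8**0.25 and its fourth power returns -8i.
    print(w, abs(w), w ** 4)
```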

EXAMPLE 2 

Using de Moivre's theorem and the binomial expansion, express $\cos 4\theta$ and $\sin 4\theta$ in terms of
powers of $\cos\theta$ and $\sin\theta$.

To do this we consider $(\cos\theta + i\sin\theta)^4$ and expand it first by de Moivre's theorem and then
by the binomial theorem. This gives the identity

     $\cos 4\theta + i\sin 4\theta = \cos^4\theta + 4i\cos^3\theta\sin\theta + 6i^2\cos^2\theta\sin^2\theta + 4i^3\cos\theta\sin^3\theta + i^4\sin^4\theta$
     $= (\cos^4\theta - 6\cos^2\theta\sin^2\theta + \sin^4\theta) + i(4\cos^3\theta\sin\theta - 4\cos\theta\sin^3\theta)$

Equating real and imaginary parts of these equal complex expressions, we obtain the required
formulas:

     $\cos 4\theta = \cos^4\theta - 6\cos^2\theta\sin^2\theta + \sin^4\theta$
     $\sin 4\theta = 4(\cos^3\theta\sin\theta - \cos\theta\sin^3\theta)$
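The two formulas just obtained can be spot-checked numerically against the library cosine and sine, and against a direct fourth power of cos(theta) + i sin(theta). The following is a minimal Python sketch, not part of the original text; the function name `cos4_sin4` and the test angle 0.37 are arbitrary choices.

```python
import math

def cos4_sin4(theta):
    """cos(4 theta) and sin(4 theta) from the formulas of Example 2."""
    c, s = math.cos(theta), math.sin(theta)
    return (c ** 4 - 6 * c ** 2 * s ** 2 + s ** 4,
            4 * (c ** 3 * s - c * s ** 3))

theta = 0.37
c4, s4 = cos4_sin4(theta)

# de Moivre's theorem: the fourth power of cos(theta) + i sin(theta)
w = complex(math.cos(theta), math.sin(theta)) ** 4

print(c4, math.cos(4 * theta), w.real)
print(s4, math.sin(4 * theta), w.imag)
```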

EXERCISES 

1 Show that multiplying a complex number by i rotates it through 90° without changing its
length. What is the effect of multiplying a complex number (a) by -i? (b) by $\sqrt{i}$?

2 A square lies entirely in the second quadrant. If one of its sides joins the points -3 and 2i,
find the coordinates of the other two vertices. 

3 Find all the distinct fourth roots of —1. 

4 Find all the distinct fifth roots of 32. 

5 Find all the distinct cube roots of i. 

6 Express the complex number $8 - 8\sqrt{3}\,i$ in polar form, and find its distinct fourth roots.

7 Find the distinct cube roots of 1 + i, and reduce each to the form a + ib, where a and b are 
decimal fractions. 

8 Find all the distinct values of (1 — £)?*. 

9 Find all the distinct values of ( — 1 — 

10 Using de Moivre's theorem, express $\cos 5\theta$ and $\sin 5\theta$ in terms of powers of $\cos\theta$ and $\sin\theta$.

11 Show that, if n is an integer, both $\cos n\theta$ and $(\sin n\theta)/(\sin\theta)$ can be expressed as polynomials
in $\cos\theta$.

12 If $z_1$ and $z_2$ are complex numbers, what point is represented by $(z_1 + z_2)/2$? What is the
locus of the points $\lambda z_1 + \mu z_2$, where $\lambda$ and $\mu$ are real parameters and $\lambda + \mu = 1$?

13 Show that the centroid of a system of three particles of equal mass situated at the points
$z_1$, $z_2$, $z_3$ is the point $(z_1 + z_2 + z_3)/3$. Where is the centroid of a system of three masses
$m_1$, $m_2$, and $m_3$ situated respectively at the points $z_1$, $z_2$, and $z_3$?

14 Using the polar form of the multiplication law, devise a geometrical construction for the 
product of two complex numbers. 

15 Devise a geometrical construction for the quotient of two complex numbers. 


14.4 

Absolute values 

We have already defined the absolute value of a complex number 
z as the length of its representative vector; i.e., 

     $|z| = \sqrt{x^2 + y^2} = \sqrt{\mathcal{R}^2(z) + \mathcal{I}^2(z)}$

From this it is evident that a complex number is zero if and only
if its absolute value is zero. Since $\mathcal{R}^2(z)$ and $\mathcal{I}^2(z)$ are both
nonnegative real numbers, it is also clear, dropping first one and
then the other of these quantities from the last equation, that*

(1)  $|z| \ge \mathcal{R}(z)$

(2)  $|z| \ge \mathcal{I}(z)$

* We must always keep in mind the fact that the complex numbers cannot be 
ordered and that greater than and less than have meaning only when applied 
to real numbers. 



Moreover, from the definition of conjugate complex numbers, it 
follows that 

(3)  $|z| = |\bar z|$
and

(4)  $z \cdot \bar z = |z|^2$

Also, from Eqs. (4) and (5), Sec. 14.3, for the products and
quotients of complex numbers expressed in polar form, it is
clear that

(5)  $|z_1 z_2| = |z_1| \cdot |z_2|$
and

(6)  $\left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|}$     $z_2 \ne 0$

Since the length of any side of a triangle must be equal to 
or less than the sum of the lengths of the other two sides, it fol- 
lows from the geometric addition of complex numbers (Fig. 14.4a) 


FIGURE 14.4
The triangle inequality applied to complex numbers. Panels (a) and (b).


that

(7)  $|z_1 + z_2| \le |z_1| + |z_2|$

This can readily be extended to three terms, for

     $|z_1 + z_2 + z_3| = |z_1 + (z_2 + z_3)|$
     $\le |z_1| + |z_2 + z_3|$
     $\le |z_1| + |z_2| + |z_3|$

The important extension to n terms is obvious:

(8)  $\left|\sum_{k=1}^{n} z_k\right| \le \sum_{k=1}^{n} |z_k|$

It is also geometrically evident that the length of any side 
of a triangle must be at least as great as the difference of the 
lengths of the other two sides (Fig. 14.46). Hence, 

(9)  $|z_1 - z_2| \ge \bigl|\,|z_1| - |z_2|\,\bigr| \ge 0$

If it happens that $|z_1|$ is greater than or equal to $|z_2|$, the outer
absolute-value signs on the right are, of course, unnecessary.
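Both triangle inequalities, the upper bound on the absolute value of a sum and the lower bound on the absolute value of a difference, can be spot-checked on randomly chosen complex numbers. The following is a minimal Python sketch, not part of the original text; the sample size, the range of the components, the fixed seed, and the small tolerance for floating-point rounding are all arbitrary assumptions. A numerical check of this kind illustrates the inequalities but does not prove them.

```python
import random

random.seed(0)  # fixed seed so that the run is repeatable

def random_complex():
    return complex(random.uniform(-5, 5), random.uniform(-5, 5))

pairs = [(random_complex(), random_complex()) for _ in range(1000)]

tol = 1e-12  # allowance for floating-point rounding
upper_ok = all(abs(z1 + z2) <= abs(z1) + abs(z2) + tol for z1, z2 in pairs)
lower_ok = all(abs(z1 - z2) >= abs(abs(z1) - abs(z2)) - tol for z1, z2 in pairs)

print(upper_ok, lower_ok)
```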





EXAMPLE 1 

Describe the region in the z-plane defined by the inequality $\mathcal{R}(z) > 1$.

If the real part of z is greater than 1, the image of z must be a point to the right of the line
x = 1. Hence, the given inequality defines the set of all points in the half plane to the right of
this line. Since the equality sign is not included in the definition of the region, points actually on
the line x = 1 do not belong to the region.


EXAMPLE 2 

What region in the z-plane is defined by $|z - z_0| \le 9$?

In words, the given inequality asserts that the distance between the image point of z and
the fixed point which is the image of $z_0$ is equal to or less than 9. This clearly defines the set of
all points within and on the boundary of the circle of radius 9 which has the image of $z_0$ as its
center (Fig. 14.5). In the work that lies ahead, we shall frequently have to consider regions of 
this type. 


FIGURE 14.5
The circular region $|z - z_0| \le 9$.



If w = (z + i)/(iz + 1), show that the restriction $\mathcal{I}(z) \le 0$ implies the restriction $|w| \le 1$.

Since we are asked to establish a certain property of $|w|$, our first step is to compute this
quantity. This can be done in various ways, but it is probably most convenient to construct the
product

     $w \cdot \bar w = |w|^2 = \frac{z + i}{iz + 1}\,\overline{\left(\frac{z + i}{iz + 1}\right)}$

Since the conjugate of a quotient is the quotient of the conjugates, this can be written as

     $|w|^2 = \frac{z + i}{iz + 1} \cdot \frac{\overline{z + i}}{\overline{iz + 1}}$

Moreover, the conjugate of a sum is the sum of the conjugates; hence we have further

     $|w|^2 = \frac{z + i}{iz + 1} \cdot \frac{\bar z + \bar i}{\overline{iz} + 1}$

Finally, since $\bar i = -i$ and $\overline{iz} = -i\bar z$, we have

     $|w|^2 = \frac{z + i}{iz + 1} \cdot \frac{\bar z - i}{-i\bar z + 1} = \frac{(z\bar z + 1) - i(z - \bar z)}{(z\bar z + 1) + i(z - \bar z)} = \frac{z\bar z + 1 + 2\mathcal{I}(z)}{z\bar z + 1 - 2\mathcal{I}(z)}$

Now, $z\bar z + 1$ is a positive quantity. Hence, it is clear that, if $\mathcal{I}(z) \le 0$, as given, then the numerator
of the last fraction is equal to or less than the denominator. Thus $|w|^2$, and, hence, $|w|$, is at
most equal to 1 under the given conditions.




Since the restriction $\Im(z) \le 0$ implies that z lies on or below the real axis in the plane in
which z is plotted and since $|w| \le 1$ implies that w lies on or inside the unit circle in the plane in
which w is plotted, it follows that the given relation

$$w = \frac{z + i}{iz + 1}$$

can be thought of as a transformation, or mapping, which sends the lower half of the z-plane,
point by point, into the region consisting of the unit circle and its interior in the w-plane. Mappings
of this sort are of considerable importance in applied mathematics, and in Chap. 17 we
shall examine their properties in more detail.
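
The inequality just established can be spot-checked numerically. The following is a minimal sketch, not a proof; the function names and the sample points are assumptions made for illustration only. It compares $|w|^2$ with the closed form derived above at a few points having $\Im(z) \le 0$.

```python
# Numerical spot check (not a proof) of the example w = (z + i)/(iz + 1).
# The helper names and sample points are illustrative, not from the text.

def w_of(z: complex) -> complex:
    """The mapping w = (z + i)/(iz + 1)."""
    return (z + 1j) / (1j * z + 1)

def modulus_squared(z: complex) -> float:
    """|w|^2 written as (z*zbar + 1 + 2 Im z)/(z*zbar + 1 - 2 Im z)."""
    s = abs(z) ** 2 + 1.0
    return (s + 2.0 * z.imag) / (s - 2.0 * z.imag)

# Sample points on or below the real axis.
lower_half_plane = [complex(x, y)
                    for x in (-3.0, -0.5, 0.0, 2.0)
                    for y in (0.0, -0.25, -1.0, -10.0)]

for z in lower_half_plane:
    w = w_of(z)
    # The direct modulus agrees with the derived closed form ...
    assert abs(abs(w) ** 2 - modulus_squared(z)) < 1e-12
    # ... and the image lies on or inside the unit circle.
    assert abs(w) <= 1.0 + 1e-12
```

Points on the real axis land on the unit circle itself, which is the boundary case of the inequality.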


EXERCISES

1 If $a$ and $b$ are real, show that $\left|\dfrac{a + ib}{b + ia}\right| = 1$. Is this true if $a$ and $b$ are not real?

2 Find $|z|$, $\Re(z)$, and $\Im(z)$ if $z = [(3 + 4i)(12 - 5i)]/2i$.

3 Under what conditions will $|z_1 + z_2| = |z_1| + |z_2|$?

4 Show that $\dfrac{1}{|z_1 + z_2|} \le \dfrac{1}{\big|\,|z_1| - |z_2|\,\big|}$. Under what conditions will the equality sign hold?

5 Show that $|x + iy| \ge (|x| + |y|)/\sqrt{2}$. Under what conditions will the equality sign hold?

6 Show that $|z_1 - z_2|^2 + |z_1 + z_2|^2 = 2|z_1|^2 + 2|z_2|^2$.

^ 2 ^ a positive constant different 

from 1, is a circle. What is the locus if k = 1? if k = 0? if k < 0?

What region in the z-plane is defined by the inequalities 0 < <R(z) 2s tf(z)? 

What region in the z-plane is defined by the inequality |z — 1| ^ tfl(z)? 

10 If $w = i(1 - z)/(1 + z)$, prove that $|z| < 1$ implies $\Im(w) > 0$.

11 Without using any properties of the polar representation of complex numbers, prove that
$|z_1 z_2| = |z_1| \cdot |z_2|$.

12 Prove algebraically that $|z_1 + z_2| \le |z_1| + |z_2|$. [Hint: Consider the identity
$|z_1 + z_2|^2 = (z_1 + z_2)(\bar{z}_1 + \bar{z}_2)$.]

13 Prove algebraically that $|z_1 - z_2| \ge \big|\,|z_1| - |z_2|\,\big| \ge 0$.

14 Prove that, if $z + 1/z$ is real, then either z is real or the absolute value of z is 1.

15 If $z_1, z_2, \ldots, z_n$ and $w_1, w_2, \ldots, w_n$ are complex numbers, prove that

$$\left|\sum_{i=1}^{n} z_i w_i\right|^2 \le \sum_{i=1}^{n} |z_i|^2 \sum_{i=1}^{n} |w_i|^2$$

This result is sometimes known as Cauchy's inequality. [Hint: Consider the discriminant of
the quadratic equation $\sum_{i=1}^{n} (|z_i|\lambda - |w_i|)^2 = 0$.]


14.5

Functions of a complex variable

If $z = x + iy$ and $w = u + iv$ are two complex variables, and if,
for each value of z in some portion of the complex plane, one or
more values of w are defined, then w is said to be a function of z,
and we write

$$w = f(z)$$

If $w = f(z)$, that is, if

$$u + iv = f(x + iy)$$

it follows that the real numbers u and v are themselves determined
by the real numbers x and y. Hence, the assertion that w is a
function of $z = x + iy$ can also be written

$$w = u(x,y) + iv(x,y) \tag{1}$$

where u(x,y) and v(x,y) are real- valued functions of the real 
variables x and y. Clearly, whenever a value of z is given, values 
of x and y are thereby provided, and, thus, one or more values 
of w are determined by (1). For example, if 

$$w = f(z) = (x^2 - y) + (x + y^2)i$$

and if $z = 1 + 2i$, then $x = 1$ and $y = 2$, and, thus,

$$f(1 + 2i) = (1^2 - 2) + (1 + 2^2)i = -1 + 5i$$
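
The evaluation above can be reproduced mechanically. This is a small sketch in which the function name `f` simply mirrors the text; it builds $w$ from the real functions $u(x,y)$ and $v(x,y)$.

```python
def f(z: complex) -> complex:
    """w = u(x, y) + i v(x, y) with u = x^2 - y and v = x + y^2."""
    x, y = z.real, z.imag
    u = x ** 2 - y
    v = x + y ** 2
    return complex(u, v)

print(f(1 + 2j))  # (-1+5j)
```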

If w is defined as a function of z in the form (1), it may be
possible by suitable manipulations to rearrange the expression
$u(x,y) + iv(x,y)$ so that x and y occur only in the binomial
combination $x + iy$. For instance,

$$w = (x^2 - y^2) + 2ixy$$

is immediately recognizable as

$$w = (x + iy)^2 = z^2$$

and

$$w = \frac{x}{x^2 + y^2} - i\,\frac{y}{x^2 + y^2}$$

is nothing but the standard complex form of

$$w = \frac{1}{x + iy} = \frac{1}{z}$$

On the other hand, it may be impossible to express w in a form
involving only the explicit combination $x + iy$ without using
such "artificial" expressions as $\Re(z) = x$ and $\Im(z) = y$, with
which, of course, any formula in x and y can be written in terms
of z. For instance, unless we resort to "artificial" functions, no
rearrangement of the formula

$$w = 7x + 3iy = 4\,\Re(z) + 3z = 7z - 4i\,\Im(z) = 5z + 2\bar{z}$$

can reduce w to explicit dependence on z alone. In our work and,
in fact, in most applications of complex variable theory, the
only functions of interest will be those which can be written in
terms of z alone, without recourse to $\bar{z}$, $\Re(z)$, $\Im(z)$, and similar
expressions.
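
As a quick check of the rearrangements discussed above, the sketch below evaluates each formula once from $x$ and $y$ and once from $z$ (and, where unavoidable, $\bar{z}$). The function names and dictionary keys are assumptions made for this illustration.

```python
def from_parts(x: float, y: float) -> dict:
    """The three formulas written in terms of x and y."""
    r2 = x * x + y * y
    return {
        "z_squared": complex(x * x - y * y, 2 * x * y),
        "reciprocal": complex(x / r2, -y / r2),
        "mixed": complex(7 * x, 3 * y),
    }

def from_z(z: complex) -> dict:
    """The same formulas written in terms of z; 'mixed' still needs zbar."""
    return {
        "z_squared": z * z,
        "reciprocal": 1 / z,
        "mixed": 5 * z + 2 * z.conjugate(),
    }

z = complex(1.5, -0.5)
a, b = from_parts(z.real, z.imag), from_z(z)
for name in a:
    assert abs(a[name] - b[name]) < 1e-12
```

Only the third formula requires the conjugate, which is exactly the kind of function the text sets aside.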

Frequently our interest in a function will be restricted to 
its behavior at the points of some specified part of the z-plane. 




However, before we can undertake discussions of this sort, we 
must define and explain some of the simpler properties of the 
sets of points we intend to consider. 

By a neighborhood of a point $z_0$ we mean any set consisting
of all the points which satisfy an inequality of the form

$$|z - z_0| < \epsilon \qquad \epsilon > 0$$

Geometrically speaking, a neighborhood of $z_0$ thus consists of all
the points within but not on a circle having $z_0$ as center. A point
$z_0$ belonging to a set S is said to be an interior point of S if there
exists at least one neighborhood of $z_0$ whose points all belong to S.
A set each of whose points is an interior point is said to be open.
A point $z_0$ not belonging to a set S is said to be exterior to S if
there exists at least one neighborhood of $z_0$ none of whose points
belongs to S. Intermediate between points interior to S and
points exterior to S are the boundary points of S. A point $z_0$ is
said to be a boundary point of a set S if every neighborhood of
$z_0$ contains both points belonging to S and points not belonging to
S. The boundary points of a set may or may not belong to the
set, depending upon its definition.

A point $z_0$ is said to be a limit point of a set if every neighborhood
of the point contains at least one point of the set distinct
from $z_0$. A set which contains all its limit points is said to be
closed. Obviously, a set can be defined to contain some but not
all of its limit points; hence it is clear that a set may be neither
open nor closed.

If a set S has the property that every pair of its points can
be joined by a polygonal arc whose points all belong to the set,
it is said to be connected. An open connected set is said to be a
region or a domain. A set consisting of a region together with all
its limit points is called a closed region. A connected set S with
the property that every simple closed curve* which can be drawn
in its interior encloses only points of S is said to be simply connected.
If it is possible to draw in S at least one simple closed
curve whose interior contains points not belonging to S, then S is
said to be multiply connected.† If there exists a circle with center
at the origin enclosing all the points of a set S; i.e., if there exists
a number d such that

$$|z| < d \quad \text{for all } z \text{ in } S$$

then S is said to be bounded. A set which is not bounded is said
to be unbounded. The set consisting of the points between two
concentric circles is called an annular region or an annulus.


* See footnote to Theorem 1, Sec. 12.4.

† In two dimensions the definitions of simply connected sets and multiply
connected sets given at the end of Sec. 12.5 are clearly equivalent to those
of the present section.





The preceding ideas are illustrated in Fig. 14.6a, where the
three sets

$$S_1: \ |z - z_0| < r_1$$

$$S_2: \ r_1 \le |z - z_0| < r_2$$

$$S_3: \ r_2 \le |z - z_0|$$

are shown. The set $S_1$ consists of all points interior to the circle
$|z - z_0| = r_1$. It is bounded and simply connected. Since points
on the boundary circle $|z - z_0| = r_1$ are not included in the
definition of $S_1$, the set is open and is, therefore, a domain; in
particular it is a neighborhood of $z_0$. The set $S_2$ consists of all the
points in the annulus between the circles $|z - z_0| = r_1$ and
$|z - z_0| = r_2$ plus the points on the inner boundary of the annulus
but not those on the outer boundary. Since $S_2$ thus contains some
but not all of its boundary points, it is neither open nor closed
and is, therefore, neither a domain nor a closed region. Clearly,
there are closed curves in $S_2$, namely, any curve encircling the
inner boundary, which will enclose points not belonging to $S_2$,
namely, the points of $S_1$. Hence, $S_2$ is multiply connected. Obviously,
$S_2$ is bounded. The set $S_3$ consists of all points on and outside
the circle $|z - z_0| = r_2$. It is, therefore, unbounded, closed,
and multiply connected. According to our definition, it is a closed
region.

FIGURE 14.6 Typical regions in the complex plane. (a) (b)
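
The three sets can be expressed as membership tests. In the sketch below the center `Z0` and the radii `R1`, `R2` are arbitrary values chosen for illustration, and the function names are hypothetical. The strict and non-strict inequalities decide which boundary circle belongs to which set.

```python
# Illustrative values only; the text leaves z0, r1 and r2 general.
Z0 = 0 + 0j
R1 = 1.0
R2 = 2.0

def in_S1(z: complex) -> bool:
    """Open disk |z - z0| < r1."""
    return abs(z - Z0) < R1

def in_S2(z: complex) -> bool:
    """Annulus r1 <= |z - z0| < r2: inner circle included, outer excluded."""
    return R1 <= abs(z - Z0) < R2

def in_S3(z: complex) -> bool:
    """Closed exterior r2 <= |z - z0|."""
    return R2 <= abs(z - Z0)

for point in (0.5 + 0j, 1 + 0j, 1.5j, 2 + 0j, 3 - 4j):
    # With these definitions every point lies in exactly one of the sets.
    assert [in_S1(point), in_S2(point), in_S3(point)].count(True) == 1
```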

Because simply connected regions are in many respects
easier to work with than multiply connected regions, it is often
desirable to be able to reduce the latter to the former. This can
always be done by modifying the given multiply connected region
through the introduction of auxiliary boundary arcs, or crosscuts,
joining boundary curves that were originally disconnected. The
effectiveness of this technique is illustrated in Fig. 14.6b, which
shows a closed region originally multiply connected with one outer
boundary curve C and two inner boundary curves C' and C''.
The introduction of the auxiliary boundary arcs A'B' and A''B''
clearly makes it impossible to draw closed curves which lie
entirely in the interior of the modified region and at the same
time encircle either of the inner boundaries C' and C''. The modified
region is, therefore, simply connected, as required.

It will often be necessary for us to consider the limit of a
function of z as z approaches some particular value $z_0$. The basis
for this is the following definition:


DEFINITION 1

If $f(z)$ is a single-valued function of z and $w_0$ is a complex constant and if, for
every $\epsilon > 0$, there exists a positive number $\delta(\epsilon)$ such that $|f(z) - w_0| < \epsilon$ for all
z such that $0 < |z - z_0| < \delta$, then $w_0$ is said to be the limit of $f(z)$ as z approaches $z_0$.

In less technical terms, $w_0$ is the limit of $f(z)$ as z approaches $z_0$
provided that $f(z)$ can be kept arbitrarily close to $w_0$ by keeping
z sufficiently close to but distinct from $z_0$.

EXAMPLE 1

If $f(z) = (x + y)^2/(x^2 + y^2)$, show that

$$\lim_{x \to 0}\Big[\lim_{y \to 0} f(z)\Big] = 1 \quad \text{and} \quad \lim_{y \to 0}\Big[\lim_{x \to 0} f(z)\Big] = 1$$

but that $\lim_{z \to 0} f(z)$ does not exist.

Clearly,

$$\lim_{x \to 0}\Big[\lim_{y \to 0} f(z)\Big] = \lim_{x \to 0}\left[\lim_{y \to 0} \frac{(x + y)^2}{x^2 + y^2}\right] = \lim_{x \to 0}\,(1) = 1$$

and

$$\lim_{y \to 0}\Big[\lim_{x \to 0} f(z)\Big] = \lim_{y \to 0}\left[\lim_{x \to 0} \frac{(x + y)^2}{x^2 + y^2}\right] = \lim_{y \to 0}\,(1) = 1$$

as asserted. On the other hand, for $\lim_{z \to 0} f(z)$ to exist, it is necessary that $f(z)$ approach the
same value along all paths leading to the origin, and this is not the case; for along the paths $y = mx$ we have

$$\lim_{z \to 0} f(z) = \lim_{x \to 0} \frac{(x + mx)^2}{x^2 + m^2x^2} = \frac{(1 + m)^2}{1 + m^2}$$

The limiting value here clearly depends on m; that is, $f(z)$ approaches different values along
different radial lines, and hence no limit exists.
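
The dependence on the path can be seen numerically. The sketch below evaluates the function of this example very close to the origin along several lines $y = mx$; the helper names and the step `t` are assumptions chosen for illustration.

```python
def f(x: float, y: float) -> float:
    """f = (x + y)^2 / (x^2 + y^2), defined away from the origin."""
    return (x + y) ** 2 / (x ** 2 + y ** 2)

def along_line(m: float, t: float = 1e-8) -> float:
    """Value of f at the point (t, m t) on the line y = m x."""
    return f(t, m * t)

for m in (0.0, 1.0, -1.0, 2.0):
    predicted = (1 + m) ** 2 / (1 + m ** 2)
    # f is constant along each line, so the value near 0 equals the limit.
    assert abs(along_line(m) - predicted) < 1e-9
    print(m, along_line(m))
```

Different slopes give different limiting values, which is the content of the example.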

Closely associated with the concept of a limit is the concept 
of continuity: 


DEFINITION 2

The function $f(z)$ is continuous at the point $z_0$ provided that $\lim_{z \to z_0} f(z) = f(z_0)$.

In other words, for a function to be continuous at a point $z_0$, the
function must have both a value at that point and a limit as z
approaches that point, and the two must be equal. If $f(z)$ is continuous
at every point of a region, it is said to be continuous
throughout the region.

In addition to the fundamental theorems on limits we encountered
in calculus, there are various theorems on continuous
functions which we shall need from time to time. For the most
part these appear almost self-evident, although their proofs are
by no means trivial. We shall merely list them here, and refer to
standard texts on advanced calculus for their proof.*

THEOREM 1

Sums, differences, and products of continuous functions and quotients of continuous
functions, provided the divisor functions are different from zero, are
continuous.

THEOREM 2

A continuous function of a continuous function is continuous.

THEOREM 3

A necessary and sufficient condition that

$$f(z) = u(x,y) + iv(x,y)$$

be continuous is that the real functions $u(x,y)$ and $v(x,y)$ be continuous.

THEOREM 4

If $f(z)$ is continuous at a point $z_0$ and if $f(z_0) \ne 0$, then there exists a neighborhood
of $z_0$ throughout which $f(z)$ is different from 0.

THEOREM 5

If $f(z)$ is continuous over a bounded, closed region R, then there exists a positive
constant M such that $|f(z)| < M$ for all values of z in R.

EXERCISES 

1 If f(z) = xy + t'(x 3 - y 3 ), what is/( —1 + 2t)? 

2 If/(z) = z +'('*)* + 8{zz), what is/(2 + »)?' 

3 Express $(2xy + 2x - 1) - i(x^2 - y^2 - 2y)$ as a polynomial in the binomial argument
$z = x + iy$.

4 Express x 3 + ty 3 in terms of z and z. 

5 Describe each of the following sets of points, telling whether it is bounded or unbounded, 
open or closed, and simply or multiply connected: 

a 3(z) >0 b 2'£'H £ 3 

c |z — 1| > 4 dOg <R(z) g 1 

e 0 £ 8(z) < <fl(z) f |z 3 -1\&H 

6 Show that lim — — — does not exist. 

x 2 4- y 2 

7 Show that lim — — ■■■■■- does not exist, even though this function approaches the same 

z — + 0 X* + y" 

limit along every straight line through the origin. 

8 If f{z) = { S ' n 1/,y y ^ n show that lim [lim /(z)] and lim f(z ) exist and are equal, 

i u y — U y —*0 x — >0 2-»0 

but that lim [lim /(z)] does not exist. 

x -*0 y —>0 

9 Prove Theorem 2. 

10 Show that every neighborhood of a limit point of a set S contains infinitely many points of S. 


* See, for instance, A. E. Taylor, “Advanced Calculus,” pp. 494-503, Ginn 
and Company, Boston, 1955. 




14.6 

Analytic functions 


The derivative of a function of a complex variable $w = f(z)$ is
defined as

$$\frac{dw}{dz} = w' = f'(z) = \lim_{\Delta z \to 0} \frac{f(z + \Delta z) - f(z)}{\Delta z} \tag{1}$$

This definition is formally identical with that for the derivative
of a function of a real variable. Moreover, since the general theory
of limits is phrased in terms of absolute values, it is valid for
complex variables as well as for real variables. Hence it is clear
that formulas for the differentiation of functions of a real variable
will have identical counterparts in the field of complex numbers
when the corresponding functions of a complex variable are
suitably defined. In particular, such familiar formulas as

$$\frac{d(w_1 \pm w_2)}{dz} = \frac{dw_1}{dz} \pm \frac{dw_2}{dz}$$

$$\frac{d(w_1 w_2)}{dz} = w_1 \frac{dw_2}{dz} + w_2 \frac{dw_1}{dz}$$

$$\frac{d(w_1/w_2)}{dz} = \frac{w_2\,(dw_1/dz) - w_1\,(dw_2/dz)}{w_2^2} \qquad w_2 \ne 0$$

$$\frac{d(w^n)}{dz} = n w^{n-1} \frac{dw}{dz}$$

are valid when $w_1$, $w_2$, and $w$ are differentiable functions of a complex
variable z. However, $\Delta z = \Delta x + i\,\Delta y$ is itself a complex variable,
and the question of just how it is to approach zero involves
difficulties which have no counterpart in the differentiation of
functions of a real variable.

In Fig. 14.7, it is clear that $\Delta z$ can approach zero; i.e., that a
point

$$Q: \ z + \Delta z$$

can approach the point $P: z$, along infinitely many paths. In
particular, Q can approach P along the line AP on which $\Delta x$ is
zero or along the line BP on which $\Delta y$ is zero. Clearly, for the
derivative of $f(z)$ to exist, it is necessary that the limit of the difference
quotient (1) be the same no matter how $\Delta z$ approaches zero. How
severe a restriction this is can be seen by considering the simple
function

$$w = f(z) = \bar{z} = x - iy$$

Giving to z the increment $\Delta z = \Delta x + i\,\Delta y$ means that x
changes by the amount $\Delta x$ and y changes by the amount $\Delta y$.
Hence,

$$\frac{f(z + \Delta z) - f(z)}{\Delta z} = \frac{[(x + \Delta x) - i(y + \Delta y)] - (x - iy)}{\Delta x + i\,\Delta y} = \frac{\Delta x - i\,\Delta y}{\Delta x + i\,\Delta y}$$

Now, if $\Delta z$ is real, so that $\Delta y = 0$, we have

$$\lim_{\Delta z \to 0} \frac{\Delta x - i\,\Delta y}{\Delta x + i\,\Delta y} = \lim_{\Delta x \to 0} \frac{\Delta x}{\Delta x} = 1$$

On the other hand, if $\Delta z$ is imaginary, so that $\Delta x = 0$, we have

$$\lim_{\Delta z \to 0} \frac{\Delta x - i\,\Delta y}{\Delta x + i\,\Delta y} = \lim_{\Delta y \to 0} \frac{-i\,\Delta y}{i\,\Delta y} = -1$$

More generally, if we let $\Delta z \to 0$ in such a way that $\Delta y = m\,\Delta x$,
we have

$$\lim_{\Delta z \to 0} \frac{\Delta x - i\,\Delta y}{\Delta x + i\,\Delta y} = \lim_{\Delta x \to 0} \frac{\Delta x - im\,\Delta x}{\Delta x + im\,\Delta x} = \frac{1 - im}{1 + im} = \frac{(1 - m^2) - 2im}{1 + m^2}$$

Thus, there are infinitely many complex values which the difference
quotient for $f(z) = x - iy$ can be made to approach by
choosing properly the manner in which $\Delta z$ shall approach zero.
It is, therefore, apparent that $\bar{z} = x - iy$ has no derivative.
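
The direction dependence of this difference quotient is easy to observe numerically. The following sketch uses a hypothetical helper `quotient` with an arbitrary base point and step size; it compares the computed quotient with the closed form $(1 - im)/(1 + im)$.

```python
def quotient(m: float, dx: float = 1e-6, z: complex = 1 + 1j) -> complex:
    """Difference quotient of f(z) = conj(z) for an increment with dy = m dx."""
    dz = complex(dx, m * dx)
    return ((z + dz).conjugate() - z.conjugate()) / dz

for m in (0.0, 1.0, -2.0, 5.0):
    predicted = (1 - 1j * m) / (1 + 1j * m)
    assert abs(quotient(m) - predicted) < 1e-8
    # Every attainable value has modulus 1, whatever the slope m.
    assert abs(abs(quotient(m)) - 1.0) < 1e-8
```

For this function the quotient does not depend on the size of the increment at all, only on its direction, so no single limit can exist.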

That a function as simple as f(z) — x — iy should have no 
derivative seems at first glance a discouraging state of affairs. 
However, there are many functions of z which do have derivatives, 
and in applications it is these functions which are of importance. 
Our immediate task is to identify these functions by obtaining 
conditions for the existence of the derivative of a function of a 
complex variable. 

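To do this, consider

$$w = f(z) = u(x,y) + iv(x,y)$$

By definition,

$$\frac{dw}{dz} = \lim_{\Delta z \to 0} \frac{\Delta w}{\Delta z} = \lim_{\substack{\Delta x \to 0 \\ \Delta y \to 0}} \frac{[u(x + \Delta x, y + \Delta y) + iv(x + \Delta x, y + \Delta y)] - [u(x,y) + iv(x,y)]}{\Delta x + i\,\Delta y} \tag{2}$$

Now, if $\Delta z$ is real, i.e., if $\Delta y = 0$, we obtain

$$\frac{dw}{dz} = \lim_{\Delta x \to 0} \frac{[u(x + \Delta x, y) + iv(x + \Delta x, y)] - [u(x,y) + iv(x,y)]}{\Delta x}$$

$$= \lim_{\Delta x \to 0} \left[ \frac{u(x + \Delta x, y) - u(x,y)}{\Delta x} + i\,\frac{v(x + \Delta x, y) - v(x,y)}{\Delta x} \right]$$

The two difference quotients which appear in the last expression
are precisely those whose limits define the partial derivatives of
u and v with respect to x. Hence, it appears that

$$\frac{dw}{dz} = \frac{\partial u}{\partial x} + i\,\frac{\partial v}{\partial x} \tag{3}$$

On the other hand, if $\Delta z$ is imaginary, i.e., if $\Delta x = 0$, we find
from (2) that

$$\frac{dw}{dz} = \lim_{\Delta y \to 0} \frac{[u(x, y + \Delta y) + iv(x, y + \Delta y)] - [u(x,y) + iv(x,y)]}{i\,\Delta y}$$

$$= \lim_{\Delta y \to 0} \left[ \frac{u(x, y + \Delta y) - u(x,y)}{i\,\Delta y} + \frac{v(x, y + \Delta y) - v(x,y)}{\Delta y} \right]$$

$$= \frac{1}{i}\,\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y}$$

or, finally,

$$\frac{dw}{dz} = \frac{\partial v}{\partial y} - i\,\frac{\partial u}{\partial y} \tag{4}$$

Thus, if the derivative $dw/dz$ is to exist, it is necessary that
the two expressions we have just derived for it be the same. Hence,
from (3) and (4),

$$\frac{\partial u}{\partial x} + i\,\frac{\partial v}{\partial x} = \frac{\partial v}{\partial y} - i\,\frac{\partial u}{\partial y}$$

which requires that

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \tag{5a}$$

$$\frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} \tag{5b}$$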

These two extremely important conditions, which are known
as the Cauchy-Riemann equations,* have arisen here from a consideration
of only two of the infinitely many ways in which $\Delta z$
can approach zero. It is, therefore, natural to expect that severe
additional conditions will be necessary to ensure that along these
other paths $\Delta w/\Delta z$ will also approach the same limit $dw/dz$. This
is not the case, however, and it can be proved without great
difficulty* that, if u and v together with their first partial derivatives
$u_x$, $u_y$, $v_x$, $v_y$ are continuous in some neighborhood of the
point $z_0$, then the Cauchy-Riemann equations are not only necessary
but also sufficient conditions for the existence of a derivative
of $w = u(x,y) + iv(x,y)$ at $z = z_0$.


* After the French mathematician Augustin Louis Cauchy (1789-1857),
and the German mathematician Georg Friedrich Bernhard Riemann (1826-
1866).


If $w = f(z)$ possesses a derivative at $z = z_0$ and at every
point in some neighborhood of $z_0$, then $f(z)$ is said to be analytic
at $z_0$, and $z_0$ is called a regular point of the function. If $f(z)$ is not
analytic at $z_0$, but if every neighborhood of $z_0$ contains points at
which $f(z)$ is analytic, then $z_0$ is called a singular point of $f(z)$. A
function analytic at every point of a region R we shall call
analytic in R. Although most writers use this term, a few substitute
such adjectives as regular and holomorphic. As a summary
of our discussion we have the following theorem:


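THEOREM 1

If u and v are real single-valued functions of x and y which, with their four first
partial derivatives, are continuous throughout a region R, then the Cauchy-
Riemann equations

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad \text{and} \quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}$$

are both necessary and sufficient conditions that $f(z) = u(x,y) + iv(x,y)$ be analytic
in R. In this case the derivative of $f(z)$ is given by either of the expressions

$$f'(z) = \frac{\partial u}{\partial x} + i\,\frac{\partial v}{\partial x} \quad \text{or} \quad f'(z) = \frac{\partial v}{\partial y} - i\,\frac{\partial u}{\partial y}$$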


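EXAMPLE 1

For $w = \bar{z} = x - iy$, we have $u = x$ and $v = -y$. In this case

$$\frac{\partial u}{\partial x} = 1 \qquad \frac{\partial u}{\partial y} = 0 \qquad \frac{\partial v}{\partial x} = 0 \qquad \frac{\partial v}{\partial y} = -1$$

and, although the second of the Cauchy-Riemann equations is satisfied everywhere, the first is
nowhere satisfied. Hence, there is no point in the z-plane where $dw/dz$ exists, which, of course,
confirms our earlier investigation of this function.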


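EXAMPLE 2

For $w = z\bar{z} = x^2 + y^2$, we have $u = x^2 + y^2$ and $v = 0$. In this case the partial derivatives

$$\frac{\partial u}{\partial x} = 2x \qquad \frac{\partial u}{\partial y} = 2y \qquad \frac{\partial v}{\partial x} = \frac{\partial v}{\partial y} = 0$$

are continuous everywhere. However, the Cauchy-Riemann equations, which in this case are,
respectively,

$$2x = 0 \quad \text{and} \quad 2y = 0$$

are satisfied only at the origin. Hence, $z = 0$ is the only point at which $dw/dz$ exists, and $w = z\bar{z}$
is nowhere analytic.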


* See, for instance, Einar Hille, “Analytic Function Theory,” vol. 1, pp. 78- 
80, Ginn and Company, Boston, 1959. 




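EXAMPLE 3

For $w = z^2 = (x^2 - y^2) + 2ixy$, we have $u = x^2 - y^2$ and $v = 2xy$. In this case

$$\frac{\partial u}{\partial x} = 2x = \frac{\partial v}{\partial y} \qquad \frac{\partial u}{\partial y} = -2y = -\frac{\partial v}{\partial x}$$

and the Cauchy-Riemann equations are identically satisfied. Moreover, the first partial derivatives
of u and v are everywhere continuous. Hence, the derivative $dw/dz$ exists at all points of
the z-plane, and its value from either (3) or (4) is

$$\frac{dw}{dz} = 2x + 2iy = 2z$$

This, of course, is exactly what formal differentiation according to the power rule would give.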
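
The three examples can be compared side by side with a finite-difference check of the Cauchy-Riemann equations. This is only a numerical sketch: the helper names, the step size `h`, and the sample point are assumptions, and central differences approximate the partial derivatives rather than compute them exactly.

```python
def partials(f, x: float, y: float, h: float = 1e-6):
    """Central-difference estimates of u_x, u_y, v_x, v_y for w = f(x + iy)."""
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)
    return fx.real, fy.real, fx.imag, fy.imag

def cauchy_riemann_residuals(f, x: float, y: float):
    """Return (u_x - v_y, u_y + v_x); both vanish where the equations hold."""
    ux, uy, vx, vy = partials(f, x, y)
    return ux - vy, uy + vx

examples = {
    "conj(z)": lambda z: z.conjugate(),    # Example 1
    "z*conj(z)": lambda z: z * z.conjugate(),  # Example 2
    "z**2": lambda z: z * z,               # Example 3
}
for name, f in examples.items():
    print(name, cauchy_riemann_residuals(f, 0.7, -1.3))
```

At the sample point only $z^2$ gives residuals near zero; $z\bar{z}$ gives approximately $(2x, 2y)$, which vanishes only at the origin.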


Analytic functions have a great many important properties, 
many of which we shall investigate in later sections. At this point 
we note only the following: 


PROPERTY 1

If both the real part and the imaginary part of an analytic function have continuous
second partial derivatives, then they satisfy Laplace's equation

$$\frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} = 0$$


PROOF Let $w = u(x,y) + iv(x,y)$ be an analytic function of z. Then u and v
must satisfy the Cauchy-Riemann equations, namely,

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad \text{and} \quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}$$

If we differentiate the first of these with respect to x and the second with respect
to y and add the results, we obtain the first assertion of the theorem:

$$\frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 v}{\partial x\,\partial y} \qquad \frac{\partial^2 u}{\partial y^2} = -\frac{\partial^2 v}{\partial y\,\partial x}$$

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$$

The existence of the second partial derivatives and their continuity, which makes
the order of differentiation in the cross partial derivatives immaterial, must here
be assumed. Later we shall show that an analytic function possesses not only a
first derivative, but derivatives of all orders, which implies the existence and continuity
of all the partial derivatives of u and v. In exactly the same way it can be
shown that v satisfies Laplace's equation.
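
Property 1 can be illustrated numerically with a five-point finite-difference Laplacian. In this sketch the analytic function $z^3$, the step `h`, and all helper names are choices made for illustration; the comparison function $x^2 + y^2$ is included because it is not harmonic.

```python
def laplacian(g, x: float, y: float, h: float = 1e-3) -> float:
    """Five-point finite-difference estimate of g_xx + g_yy."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h)
            - 4.0 * g(x, y)) / (h * h)

def u(x: float, y: float) -> float:
    """Real part of z^3."""
    return (complex(x, y) ** 3).real

def v(x: float, y: float) -> float:
    """Imaginary part of z^3."""
    return (complex(x, y) ** 3).imag

def not_harmonic(x: float, y: float) -> float:
    """x^2 + y^2, whose Laplacian is 4 everywhere."""
    return x * x + y * y

print(laplacian(u, 1.2, -0.7), laplacian(v, 1.2, -0.7), laplacian(not_harmonic, 1.2, -0.7))
```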


A function which possesses continuous second partial derivatives
and satisfies Laplace's equation is usually called a harmonic
function. Two harmonic functions u and v so related that $u + iv$
is an analytic function are called conjugate harmonic functions.*
This use of the word conjugate must not be confused with its use
in describing $\bar{z}$, the complex number conjugate to z.


* The order in the pair ( u,v ) is important, as Exercise 6 makes clear. 




PROPERTY 2

If $w = u(x,y) + iv(x,y)$ is an analytic function of z, then the curves of the family
$u(x,y) = c$ are the orthogonal trajectories of the curves of the family $v(x,y) = k$,
and vice versa.

PROOF To prove this, we compute the slope of the general curve of each
family by implicit differentiation, getting for the curves $u(x,y) = c$ the expression

$$\frac{dy}{dx} = -\frac{\partial u/\partial x}{\partial u/\partial y} \tag{6}$$

and for the curves $v(x,y) = k$ the expression

$$\frac{dy}{dx} = -\frac{\partial v/\partial x}{\partial v/\partial y} \tag{7}$$

By hypothesis, $w = u + iv$ is an analytic function. Hence, it follows from Theorem 1
that u and v satisfy the Cauchy-Riemann equations. Therefore, using these
equations, the expression (7) for the slope of the general curve of the family
$v(x,y) = k$ can be rewritten

$$\frac{dy}{dx} = \frac{\partial u/\partial y}{\partial u/\partial x}$$

which, at any common point, is just the negative reciprocal of the slope of the
general curve of the family $u(x,y) = c$, as given by Eq. (6). This suffices to prove
that the two families of curves are orthogonal trajectories, as asserted.
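
A numerical illustration of Property 2 follows. It is a sketch under stated assumptions: the test function is the exponential from Python's `cmath` module, the sample point is arbitrary, and the helper names are hypothetical. The slopes (6) and (7) are estimated by central differences and their product is compared with $-1$.

```python
import cmath

def partials(f, x: float, y: float, h: float = 1e-6):
    """Central-difference estimates of u_x, u_y, v_x, v_y for w = f(x + iy)."""
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)
    return fx.real, fy.real, fx.imag, fy.imag

def slopes(f, x: float, y: float):
    """Slopes dy/dx of the curves u = c and v = k through (x, y), Eqs. (6) and (7)."""
    ux, uy, vx, vy = partials(f, x, y)
    return -ux / uy, -vx / vy

slope_u, slope_v = slopes(cmath.exp, 0.3, 0.7)
# Perpendicular curves have slopes whose product is -1.
assert abs(slope_u * slope_v + 1.0) < 1e-6
```

The check breaks down at points where $\partial u/\partial y$ or $\partial v/\partial y$ vanishes, since one of the curves is then vertical and its slope is undefined.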


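PROPERTY 3

If, in any analytic function $w = u(x,y) + iv(x,y)$, the variables x and y are replaced
by their equivalents in terms of z and $\bar{z}$, namely,

$$x = \frac{z + \bar{z}}{2} \quad \text{and} \quad y = \frac{z - \bar{z}}{2i}$$

then w will appear as a function of z alone.

PROOF Although z and $\bar{z}$ are clearly dependent, since either is determined
when the other is given, we can regard w, by virtue of the given substitutions, as
formally a function of two new independent variables z and $\bar{z}$. To show that w
depends only on z and does not involve $\bar{z}$, it is sufficient to compute $\partial w/\partial \bar{z}$ and verify
that it is identically zero. Now,

$$\frac{\partial w}{\partial \bar{z}} = \frac{\partial (u + iv)}{\partial \bar{z}} = \frac{\partial u}{\partial \bar{z}} + i\,\frac{\partial v}{\partial \bar{z}} = \left( \frac{\partial u}{\partial x}\frac{\partial x}{\partial \bar{z}} + \frac{\partial u}{\partial y}\frac{\partial y}{\partial \bar{z}} \right) + i \left( \frac{\partial v}{\partial x}\frac{\partial x}{\partial \bar{z}} + \frac{\partial v}{\partial y}\frac{\partial y}{\partial \bar{z}} \right)$$

Moreover, from the equations expressing x and y in terms of z and $\bar{z}$, we have

$$\frac{\partial x}{\partial \bar{z}} = \frac{1}{2} \qquad \frac{\partial y}{\partial \bar{z}} = -\frac{1}{2i} = \frac{i}{2}$$

Hence, we can write

$$\frac{\partial w}{\partial \bar{z}} = \left( \frac{1}{2}\frac{\partial u}{\partial x} + \frac{i}{2}\frac{\partial u}{\partial y} \right) + i \left( \frac{1}{2}\frac{\partial v}{\partial x} + \frac{i}{2}\frac{\partial v}{\partial y} \right) = \frac{1}{2}\left( \frac{\partial u}{\partial x} - \frac{\partial v}{\partial y} \right) + \frac{i}{2}\left( \frac{\partial u}{\partial y} + \frac{\partial v}{\partial x} \right)$$

Since w, by hypothesis, is an analytic function, u and v satisfy the Cauchy-
Riemann equations, and, therefore, each of the quantities in parentheses in the
last expression vanishes. Thus $\partial w/\partial \bar{z} \equiv 0$. Hence, w is independent of $\bar{z}$, that is,
depends on x and y only through the combination $z = x + iy$.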
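
The last expression in the proof equals $\tfrac{1}{2}(\partial w/\partial x + i\,\partial w/\partial y)$, which suggests a simple numerical test. In the sketch below, the helper name `d_dzbar`, the step size, and the sample point are assumptions; the partial derivatives are approximated by central differences.

```python
def d_dzbar(f, z: complex, h: float = 1e-6) -> complex:
    """Estimate dw/dzbar = (w_x + i w_y)/2 by central differences."""
    w_x = (f(z + h) - f(z - h)) / (2 * h)
    w_y = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return 0.5 * (w_x + 1j * w_y)

z0 = complex(0.4, -1.1)
# Analytic: no dependence on zbar.
print(d_dzbar(lambda z: z * z, z0))
# 5z + 2 zbar (that is, 7x + 3iy): the derivative with respect to zbar is 2.
print(d_dzbar(lambda z: 5 * z + 2 * z.conjugate(), z0))
```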

EXERCISES 

1 At what points does $(z - 2)/[(z + 1)(z^2 + 1)]$ fail to be analytic?

2 Show that at no point in the z-plane does the derivative of $f(z) = \Re(z) = x$ exist. Does this
contradict the fact that according to the rules of calculus $dx/dx = 1$? Explain.

3 Where are the Cauchy-Riemann equations satisfied for the function $f(z) = xy^2 + ix^2y$?
Where does $f'(z)$ exist? Where is $f(z)$ analytic?

4 Verify by direct substitution that $\Re(z^3)$ and $\Im(z^3)$ satisfy Laplace's equation.

5 If $u + iv$ is an analytic function, under what conditions, if any, will $v + iu$ be analytic?

6 If u and v are conjugate harmonic functions, show that v and $-u$ as well as $-v$ and u are
also conjugate harmonic functions, but that v and u are not.

7 Show that the various values approached by the difference quotient of $f(z) = \bar{z}$ as $\Delta z \to 0$
along the lines $y = mx$ all lie on a circle.

8 Is the converse of Property 2 true? That is, if $u(x,y) = c$ and $v(x,y) = k$ are orthogonal
trajectories, is $u + iv$ necessarily an analytic function?

9 Prove that, if $f'(z) \equiv 0$, then $f(z)$ is a constant.

10 If in the function $f(z) = u + iv$ we take z in polar form, namely,

$$z = r(\cos\theta + i\sin\theta)$$

show that the Cauchy-Riemann equations become

$$\frac{\partial u}{\partial r} = \frac{1}{r}\frac{\partial v}{\partial \theta} \quad \text{and} \quad \frac{\partial v}{\partial r} = -\frac{1}{r}\frac{\partial u}{\partial \theta}$$

11 If w is an analytic function of $z = r(\cos\theta + i\sin\theta)$, show that

$$\frac{dw}{dz} = (\cos\theta - i\sin\theta)\,\frac{\partial w}{\partial r} = -\frac{\sin\theta + i\cos\theta}{r}\,\frac{\partial w}{\partial \theta}$$

12 If f(z) is an analytic function, show that 

* (>*>')' + (| H’" 1 '™’ 

13 If $f(z)$ and $\overline{f(z)}$ are both analytic functions, show that $f(z)$ is a constant.

14 If $f(z)$ is an analytic function for which $u^2 + v^2$ is a constant, show that $f(z)$ is a constant.

15 Prove L'Hospital's rule for analytic functions: If $f(z)$ and $g(z)$ are analytic functions in a
region containing $z_0$, if $f(z_0) = g(z_0) = 0$, and if $g'(z_0) \ne 0$, then
$\lim_{z \to z_0} \dfrac{f(z)}{g(z)} = \dfrac{f'(z_0)}{g'(z_0)}$.


14.7

The elementary functions of z

The exponential function $e^z$ is of fundamental importance, not
only for its own sake, but also as a basis for defining all the other
elementary functions. In its definition we seek to preserve as
many of the characteristic properties of the real exponential function
$e^x$ as possible. Specifically, we desire that

a $e^z$ shall be single-valued and analytic
b $de^z/dz = e^z$
c $e^z$ shall reduce to $e^x$ when $\Im(z) = 0$
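If we let

$$e^z = u + iv \tag{1}$$

and recall from Eq. (3), Sec. 14.6, that the derivative of an
analytic function can be written in the form

$$f'(z) = \frac{\partial u}{\partial x} + i\,\frac{\partial v}{\partial x}$$

then, to satisfy condition b, we must have

$$\frac{\partial u}{\partial x} + i\,\frac{\partial v}{\partial x} = u + iv$$

Hence, equating real and imaginary parts,

$$\frac{\partial u}{\partial x} = u \tag{2}$$

$$\frac{\partial v}{\partial x} = v \tag{3}$$

Now, Eq. (2) will be satisfied if

$$u = e^x \phi(y) \tag{4}$$

where $\phi(y)$ is any function of y. Moreover, since $e^z$ is to be analytic
(condition a), u and v must satisfy the Cauchy-Riemann equations;
hence, using the second of these equations, Eq. (3) can be
written

$$-\frac{\partial u}{\partial y} = v \tag{5}$$

Differentiating this with respect to y, we obtain

$$-\frac{\partial^2 u}{\partial y^2} = \frac{\partial v}{\partial y}$$

or, replacing $\partial v/\partial y$ by $\partial u/\partial x$ according to the first of the Cauchy-
Riemann equations,

$$-\frac{\partial^2 u}{\partial y^2} = \frac{\partial u}{\partial x}$$

Finally, using (2), this becomes

$$\frac{\partial^2 u}{\partial y^2} = -u$$

which, on substituting $u = e^x \phi(y)$ from (4), reduces to

$$e^x \phi''(y) = -e^x \phi(y) \quad \text{or} \quad \phi''(y) = -\phi(y)$$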





This is a simple linear differential equation whose solution
can be written down at once:

$$\phi(y) = A\cos y + B\sin y$$

Hence, from (4),

$$u = e^x \phi(y) = e^x (A\cos y + B\sin y)$$

and, from (5),

$$v = -\frac{\partial u}{\partial y} = -e^x(-A\sin y + B\cos y)$$

Therefore, from (1),

$$e^z = u + iv = e^x[(A\cos y + B\sin y) + i(A\sin y - B\cos y)]$$

If this is to reduce to $e^x$ when $y = 0$, as required by condition c,
we must have

$$e^x = e^x(A - iB)$$

which will be true if and only if $A = 1$ and $B = 0$.

Thus we have been led inevitably to the conclusion that if
there is a function of z satisfying the conditions a, b, and c, then
it must be

$$e^z = e^{x+iy} = e^x(\cos y + i\sin y) \tag{6}$$

That this expression does, indeed, meet our requirements can be
checked immediately; hence, we adopt it as the definition of $e^z$.
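
Definition (6) can be compared with a library implementation. The sketch below assumes Python's standard `math` and `cmath` modules; the function name `exp_from_definition` and the sample points are illustrative. It also checks condition c on the real axis.

```python
import cmath
import math

def exp_from_definition(z: complex) -> complex:
    """e^z = e^x (cos y + i sin y), Eq. (6)."""
    return math.exp(z.real) * complex(math.cos(z.imag), math.sin(z.imag))

for z in (0j, 1 + 2j, -0.5 + 3j, 2 - 1j):
    assert abs(exp_from_definition(z) - cmath.exp(z)) < 1e-12

# Condition c: on the real axis the definition reduces to the real exponential.
assert abs(exp_from_definition(complex(1.5, 0.0)) - math.exp(1.5)) < 1e-12
```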

It is important to note that the right-hand side of (6) is in
standard polar form. Hence,

$$\operatorname{mod} e^z = |e^z| = e^x \quad \text{and} \quad \arg e^z = y$$

The possibility of writing any complex number in exponential
form is now apparent, for, applying (6), with $x = 0$ and $y = \theta$
we have

$$\cos\theta + i\sin\theta = e^{i\theta} \tag{7}$$

and thus

$$r(\cos\theta + i\sin\theta) = re^{i\theta} \tag{8}$$

The fact that the angle, or argument, of a complex number is
actually an exponent explains why the angles of complex numbers
are added when the numbers are multiplied and subtracted when
the numbers are divided, as we found in Sec. 14.3.

From the relation

$$e^{i\theta} = \cos\theta + i\sin\theta$$

and its obvious companion

$$e^{-i\theta} = \cos(-\theta) + i\sin(-\theta) = \cos\theta - i\sin\theta$$


we obtain, by addition and subtraction, the so-called Euler
formulas

$$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2} \quad \text{and} \quad \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}$$

On the basis of these equations, we extend the definitions of the
sine and cosine into the complex domain by the formulas

$$\cos z = \frac{e^{iz} + e^{-iz}}{2} \tag{9}$$

$$\sin z = \frac{e^{iz} - e^{-iz}}{2i} \tag{10}$$

From these definitions it is easy to establish the validity of such
familiar formulas as

$$\cos^2 z + \sin^2 z = 1$$

$$\cos(z_1 \pm z_2) = \cos z_1 \cos z_2 \mp \sin z_1 \sin z_2$$

$$\sin(z_1 \pm z_2) = \sin z_1 \cos z_2 \pm \cos z_1 \sin z_2$$

$$\frac{d(\cos z)}{dz} = -\sin z$$

$$\frac{d(\sin z)}{dz} = \cos z$$

If we expand the exponentials in (9), we find

$$\cos z = \frac{e^{i(x+iy)} + e^{-i(x+iy)}}{2} = \frac{e^{-y}e^{ix} + e^{y}e^{-ix}}{2}$$

$$= \frac{e^{-y}}{2}(\cos x + i\sin x) + \frac{e^{y}}{2}(\cos x - i\sin x)$$

$$= \cos x\,\frac{e^{y} + e^{-y}}{2} - i\sin x\,\frac{e^{y} - e^{-y}}{2}$$

or, using the usual definitions of the hyperbolic functions of real
variables,

$$\cos z = \cos(x + iy) = \cos x \cosh y - i\sin x \sinh y \tag{11}$$

Similarly, it is easy to show that

$$\sin z = \sin(x + iy) = \sin x \cosh y + i\cos x \sinh y \tag{12}$$

In particular, taking $x = 0$ in (11) and (12), we find

$$\cos iy = \cosh y \tag{13}$$

$$\sin iy = i\sinh y \tag{14}$$

The remaining trigonometric functions of z are defined in terms
of $\cos z$ and $\sin z$ by means of the usual identities.





EXAMPLE 1

What is $\cos(1 + 2i)$?

By direct use of (11) we have

$$\cos(1 + 2i) = \cos 1 \cosh 2 - i\sin 1 \sinh 2$$

$$= (0.5403)(3.7622) - i(0.8415)(3.6269)$$

$$= 2.033 - 3.052i$$
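
The arithmetic of this example can be checked with the standard library. The sketch below assumes Python's `math` and `cmath` modules; the function name is illustrative. It evaluates formula (11) directly and compares it with `cmath.cos`.

```python
import cmath
import math

def cos_from_formula(z: complex) -> complex:
    """cos z = cos x cosh y - i sin x sinh y, Eq. (11)."""
    x, y = z.real, z.imag
    return complex(math.cos(x) * math.cosh(y), -math.sin(x) * math.sinh(y))

value = cos_from_formula(1 + 2j)
assert abs(value - cmath.cos(1 + 2j)) < 1e-12
print(round(value.real, 3), round(value.imag, 3))  # 2.033 -3.052
```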

EXAMPLE 2

Prove that the only values of z for which $\sin z = 0$ are the real values $z = 0, \pm\pi, \pm 2\pi, \ldots$.

From (12), $\sin z = \sin x \cosh y + i\cos x \sinh y$. Hence, if $\sin z$ is to vanish, it is necessary
that simultaneously

$$\sin x \cosh y = 0$$

$$\cos x \sinh y = 0$$

Since y is a real number, it follows from the familiar properties of the hyperbolic cosine that
$\cosh y \ge 1$. Hence, the first of these equations can hold only if $\sin x = 0$; that is, if

$$x = 0, \pm\pi, \pm 2\pi, \ldots$$

But for these values of x, $\cos x$, being either 1 or $-1$, can never vanish. For the second equation
to hold, it is therefore necessary that $\sinh y = 0$. Since y is real, the familiar properties of the
hyperbolic sine can be invoked, leading to the conclusion that

$$y = 0$$

Hence, the only values of z for which $\sin z = 0$ are of the form

$$z = n\pi + 0i = n\pi \qquad n = 0, \pm 1, \pm 2, \ldots$$

The hyperbolic functions of z we define simply by extending
the familiar definitions into the complex number field:

(15)    \cosh z = \frac{e^{z} + e^{-z}}{2}

(16)    \sinh z = \frac{e^{z} - e^{-z}}{2}

By expanding the exponentials and regrouping, as we did in
deriving (11), we obtain without difficulty the formulas

(17)    \cosh z = \cosh x \cos y + i \sinh x \sin y

(18)    \sinh z = \sinh x \cos y + i \cosh x \sin y

In particular, by setting x = 0, we find

(19)    \cosh iy = \cos y

(20)    \sinh iy = i \sin y

The remaining hyperbolic functions are defined from cosh z and 
sinh z via the usual identities. 

The logarithm of z we define implicitly as the function
w = ln z, which satisfies the equation

(21)    e^{w} = z

If we let w = u + iv and z = re^{iθ}, Eq. (21) becomes

        e^{u+iv} = e^{u} e^{iv} = r e^{i\theta}

Hence, e^u = r or u = ln r and v = θ. Thus,

        w = u + iv = \ln r + i\theta
(22)      = \ln |z| + i \arg z

If we let θ_1 be the principal argument of z, i.e., the particular
argument of z which lies in the interval -π < θ ≤ π, Eq. (22)
can be written

(22a)   \ln z = \ln |z| + i(\theta_1 + 2n\pi)        n = 0, \pm 1, \pm 2, \ldots

which shows that the logarithmic function is infinitely many-valued.
For any particular value of n, a unique branch of the
function is determined, and the logarithm becomes effectively
single-valued. If n = 0, the resulting branch of the logarithmic
function is called the principal value. Any particular branch of
the logarithmic function is analytic, for, differentiating the
definitive relation z = e^w, we have

        \frac{dz}{dw} = e^{w} = z

or      \frac{dw}{dz} = \frac{d(\ln z)}{dz} = \frac{1}{z}

For a particular value of n the derivative of ln z thus exists for
all z ≠ 0.
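
The many-valuedness expressed by (22a) is easy to see numerically. The sketch below is illustrative only: `log_branch` is a hypothetical helper name, and it relies on the standard-library `cmath.phase`, which returns the principal argument, to play the role of θ_1.

```python
import cmath
import math


def log_branch(z, n):
    # Eq. (22a): ln z = ln|z| + i(theta_1 + 2 n pi),
    # where theta_1 is the principal argument of z.
    theta_1 = cmath.phase(z)
    return complex(math.log(abs(z)), theta_1 + 2 * n * math.pi)


z = -3 + 4j
branches = [log_branch(z, n) for n in (-1, 0, 1)]
for w in branches:
    # Every branch value w satisfies the defining relation e^w = z.
    print(w, cmath.exp(w))
```

The branch with n = 0 coincides with what `cmath.log` returns, i.e., with the principal value described in the text.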

By means of (22a) the familiar laws of logarithms which 
hold for real variables can be established for complex variables 
as well. For example, to show that 

        \ln \frac{z_1}{z_2} = \ln z_1 - \ln z_2

let     z_1 = r_1 e^{i\theta_1}    and    z_2 = r_2 e^{i\theta_2}

where θ_1 and θ_2 are the principal arguments of z_1 and z_2, respectively.
Then,

        \ln z_1 - \ln z_2 = [\ln r_1 + i(\theta_1 + 2n_1\pi)] - [\ln r_2 + i(\theta_2 + 2n_2\pi)]
                          = [\ln r_1 - \ln r_2] + i[(\theta_1 - \theta_2) + 2(n_1 - n_2)\pi]
                          = \ln \frac{r_1}{r_2} + i[(\theta_1 - \theta_2) + 2n_3\pi]
                          = \ln \left| \frac{z_1}{z_2} \right| + i \arg \frac{z_1}{z_2}
                          = \ln \frac{z_1}{z_2}

General powers of z are defined by the formula

(23)    z^{a} = e^{a \ln z}

which generalizes a familiar result for real variables which we
frequently found useful in solving linear first-order differential
equations. Since ln z is infinitely many-valued, so, too, is z^a, in
general. Specifically,

        z^{a} = e^{a \ln z} = e^{a[\ln |z| + i(\theta_1 + 2n\pi)]} = e^{a \ln |z|} \, e^{a\theta_1 i} \, e^{2na\pi i}

The last factor in this product clearly involves infinitely many
different values unless a is a rational number, say p/q, in which
case, as we saw in our discussion of de Moivre's theorem in Sec.
14.3, there are only q distinct values.*
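
As a rough illustration of this count, the following Python sketch evaluates e^{a ln z} on several branches and tallies the distinct results. The helper names `power_values` and `distinct` are hypothetical, and the tolerance used to decide when two floating-point values coincide is an arbitrary assumption.

```python
import cmath
import math


def power_values(z, a, n_values):
    # z^a = exp(a [ln|z| + i(theta_1 + 2 n pi)]) for each branch index n.
    theta_1 = cmath.phase(z)
    log_modulus = math.log(abs(z))
    return [cmath.exp(a * complex(log_modulus, theta_1 + 2 * n * math.pi))
            for n in n_values]


def distinct(values, tol=1e-9):
    # Keep one representative of each cluster of numerically equal values.
    reps = []
    for v in values:
        if all(abs(v - r) > tol for r in reps):
            reps.append(v)
    return reps


n_values = range(-6, 7)                                   # 13 branches
rational = distinct(power_values(1 + 1j, 2 / 3, n_values))
irrational = distinct(power_values(1 + 1j, math.sqrt(2), n_values))
print(len(rational), len(irrational))
```

With a = 2/3 the thirteen branches collapse onto q = 3 values, whereas with a = √2 no two of the sampled branches coincide.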


EXAMPLE 3

What is the principal value of (1 + i)^{2-i}?

By definition,

        (1 + i)^{2-i} = e^{(2-i) \ln (1+i)}
                      = e^{(2-i)[\ln \sqrt{2} + i(\pi/4 + 2n\pi)]}

The principal value of this, obtained by taking n = 0, is

        e^{(2-i)(\ln \sqrt{2} + i\pi/4)} = e^{(2 \ln \sqrt{2} + \pi/4) + i(-\ln \sqrt{2} + \pi/2)}
            = e^{\ln 2 + \pi/4} [\cos (\pi/2 - \ln \sqrt{2}) + i \sin (\pi/2 - \ln \sqrt{2})]
            = e^{\ln 2 + \pi/4} [\sin (\ln \sqrt{2}) + i \cos (\ln \sqrt{2})]
            = e^{1.4785} (\sin 0.3466 + i \cos 0.3466)
            = 1.490 + 4.126i
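
The result of this example can be reproduced directly from definition (23) with the principal logarithm. This is a minimal check, assuming only that the standard-library `cmath.log` returns the principal value (n = 0); the variable names are illustrative.

```python
import cmath

# Principal value of (1 + i)^(2 - i) from z^a = exp(a ln z), with ln taken
# on its principal branch.
exponent = 2 - 1j
principal = cmath.exp(exponent * cmath.log(1 + 1j))
print(principal)
```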

The inverse trigonometric and hyperbolic functions we define 
implicitly. For instance, 
        w = \cos^{-1} z

we define as the value or values of w which satisfy the equation

        z = \cos w = \frac{e^{iw} + e^{-iw}}{2}

From this, by obvious steps, we obtain successively

        e^{2iw} - 2z e^{iw} + 1 = 0
        e^{iw} = z \pm \sqrt{z^2 - 1}

and, finally, by taking logarithms and solving for w,

(24)    w = \cos^{-1} z = -i \ln (z \pm \sqrt{z^2 - 1})

Since the logarithm is infinitely many-valued, so, too, is cos^{-1} z.
Similarly, we can obtain the formulas

(25)    \sin^{-1} z = -i \ln (iz \pm \sqrt{1 - z^2})

(26)    \tan^{-1} z = \frac{i}{2} \ln \frac{i + z}{i - z}

(27)    \cosh^{-1} z = \ln (z \pm \sqrt{z^2 - 1})

(28)    \sinh^{-1} z = \ln (z \pm \sqrt{z^2 + 1})

(29)    \tanh^{-1} z = \frac{1}{2} \ln \frac{1 + z}{1 - z}


* However, in the particular case z = e the expression z^a = e^a is single-valued
for all values of a, rational or not, since e^{a_r + i a_i} was defined simply as
e^{a_r} (cos a_i + i sin a_i), which is clearly a unique complex number.




From these, after their principal values have been suitably defined 
by choosing the positive square root and the principal value of 
the logarithm in each case, the usual differentiation formulas can 
be obtained without difficulty. 
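
A quick numerical sanity check of formula (24) is sketched below in Python. The function name `arccos_values` is hypothetical; it returns the two values obtained from the two signs of the square root, using the principal logarithm and principal square root of the standard-library `cmath` module, and then confirms that the cosine of each value recovers z.

```python
import cmath


def arccos_values(z):
    # Eq. (24): w = -i ln(z +/- sqrt(z^2 - 1)), principal logarithm only.
    root = cmath.sqrt(z * z - 1)
    return [-1j * cmath.log(z + root), -1j * cmath.log(z - root)]


z = 0.5 + 2j
candidates = arccos_values(z)
for w in candidates:
    # cos w should reproduce z for either choice of sign.
    print(w, cmath.cos(w))
```

Because (z + √(z² - 1))(z - √(z² - 1)) = 1, the two values printed are negatives of each other, as one expects from the evenness of the cosine.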


EXERCISES

1 Prove that cos² z + sin² z = 1.

2 Prove that cos (z_1 ± z_2) = cos z_1 cos z_2 ∓ sin z_1 sin z_2.

3 Prove that sin (z_1 ± z_2) = sin z_1 cos z_2 ± cos z_1 sin z_2.

4 Prove that d(cos z)/dz = -sin z.

5 Prove that d(sin z)/dz = cos z.

6 Express each of the following in the form a + ib, where a and b are decimal fractions: 


a sin (2 — i) b cosh (1 + 5 

d The principal value of ln (-3 + 4i)
e The principal value of (1 - i)^{2-3i}


c sinh (2 + 3 i) 


7 Show that the various values of (1 + i) 1- » differ only in their lengths. 

8 Prove that there is no value of z for which e^z = 0.

9 If g{x,y) is a real function of x and y, what is 

10 Prove that e^{\bar z} = \overline{e^{z}}.

11 Prove that \cos \bar z = \overline{\cos z}.

12 Is \ln \bar z = \overline{\ln z}?

13 Is \sin \bar z = \overline{\sin z}?

14 Prove that the only zeros of cos z are the values ±π/2, ±3π/2, ±5π/2, . . . .

15 Find all solutions of the equation sin z = 3.

16 Find all solutions of the equation cosh z = -2.

17 Find all solutions of the equation e^z = -2.

18 By inspection, e^0 > 0 and e^{iπ} < 0; yet by Exercise 8 there is no value of z for which e^z = 0
even though e^z is everywhere continuous. Explain.

19 Show that Rolle's theorem fails to hold for the function e^{iz} - 1, even though the conditions
of the theorem appear to be satisfied with respect to the two values z = 0 and z = 2π.
Explain.

20 Show that |sin z|² = sin² x + sinh² y and that |cos z|² = cos² x + sinh² y. What is |sinh z|²?
What is |cosh z|²?

21 If z = x + iy, show that |sinh y| ≤ |sin z| ≤ cosh y.

22 If z = x + iy, show that |sinh y| ≤ |cos z| ≤ cosh y.

23 a If |z| S 1, show that |sin z| g %\z\. 

b Obtain an upper bound for |cos z|, given that |z| ≤ 1.

24 Show that \sin \bar z and \cos \bar z are not analytic functions of z.

25 Prove that

        \tan z = \frac{\sin 2x + i \sinh 2y}{\cos 2x + \cosh 2y}


14.8

Integration in the complex plane


Line integrals in the complex plane are defined as follows: Let
f(z) = u(x,y) + iv(x,y) be any continuous function of z, analytic
or not, and let C be a sectionally smooth arc joining the points A
and B. Divide C into n subintervals Δs_k by the points z_k (k = 1,
2, . . . , n - 1), and let Δz_k be the infinitesimal chord determined
by Δs_k. Finally, in each subinterval choose an arbitrary
point ζ_k = ξ_k + iη_k (Fig. 14.8). Then, if it exists, the limit of the
sum

(1)     \sum_{k=1}^{n} f(\zeta_k) \, \Delta z_k

as n becomes infinite in such a way that the length of each chord
Δz_k approaches zero is called the line integral of f(z) along C:

(2)     \int_C f(z) \, dz = \lim_{n \to \infty} \sum_{k=1}^{n} f(\zeta_k) \, \Delta z_k

In the special case when A and B coincide and C is a closed curve,
the integral in (2) is often called a contour integral and is sometimes
represented by the symbol

        \oint_C f(z) \, dz
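
The defining sum (1) can be evaluated directly on a computer. The sketch below is illustrative only: the function name `line_integral`, the parametrization of the path by a parameter t in [0, 1], and the choice of ζ_k at the mid-parameter of each subinterval are all assumptions made for the example, not part of the text.

```python
def line_integral(f, path, n=2000):
    # Sum (1): add up f(zeta_k) * dz_k, where dz_k is the chord between
    # consecutive division points and zeta_k lies on the arc between them.
    total = 0j
    z_prev = path(0.0)
    for k in range(1, n + 1):
        z_next = path(k / n)
        zeta = path((k - 0.5) / n)
        total += f(zeta) * (z_next - z_prev)
        z_prev = z_next
    return total


# Straight segment from 0 to 1 + i, with the integrand f(z) = z^2.
segment = lambda t: t * (1 + 1j)
approx = line_integral(lambda z: z * z, segment)
exact = (1 + 1j) ** 3 / 3            # antiderivative z^3/3 at the end points
print(approx, exact)
```

Increasing n shortens every chord, which is exactly the limiting process described in definition (2).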

In working with complex line integrals it is frequently necessary
to establish bounds on their absolute values. To do this,
let us return to the definitive sum (1) and apply to it the fundamental
fact that the absolute value of a sum of complex numbers
is less than or equal to the sum of their absolute values [Eq. (8),
Sec. 14.4]. Then,

        \left| \sum_{k=1}^{n} f(\zeta_k) \, \Delta z_k \right| \le \sum_{k=1}^{n} |f(\zeta_k) \, \Delta z_k| = \sum_{k=1}^{n} |f(\zeta_k)| \, |\Delta z_k|

the last equality following from the fact that the absolute value
of a product is equal to the product of the absolute values [Eq.
(5), Sec. 14.4]. As n → ∞, this yields a corresponding inequality
for the integrals which are the limits of the respective sums:

(3)     \left| \int_C f(z) \, dz \right| \le \int_C |f(z)| \, |dz|

The integral on the right is the real line integral

        \int_C \sqrt{u^2 + v^2} \, \sqrt{(dx)^2 + (dy)^2} = \int_C \sqrt{u^2 + v^2} \, ds

where ds is the differential of arc length on C, which of course
exists since C is assumed to be sectionally smooth. In particular,
if f(z) = 1, we have the simple but important result

(4)     \int_C |dz| = \int_C ds = L

where L is the length of the path of integration. Since f(z) is
assumed to be continuous on the path of integration, including
the end points A and B, it follows that f(z) is a bounded function
of z on the path of integration; that is, that there exists a constant
M such that |f(z)| ≤ M for all values of z on C. Hence we have,
from (3),

        \left| \int_C f(z) \, dz \right| \le \int_C |f(z)| \, |dz| \le \int_C M \, |dz| = M \int_C |dz|

Therefore, using (4), we obtain the important inequality

(5)     \left| \int_C f(z) \, dz \right| \le ML

where M is any bound for |f(z)| on C, and L is the length of the
path of integration.
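
To see the bound at work on a concrete case, here is a small Python sketch. The integrand 1/z², the semicircular path, and the helper name `line_integral` are choices made for illustration only; on this arc |f(z)| = 1/4 exactly, so M = 1/4, and the arc length is L = 2π.

```python
import cmath
import math


def line_integral(f, path, n=4000):
    # Chord sum approximating the integral of f along path(t), 0 <= t <= 1.
    total = 0j
    z_prev = path(0.0)
    for k in range(1, n + 1):
        z_next = path(k / n)
        total += f(path((k - 0.5) / n)) * (z_next - z_prev)
        z_prev = z_next
    return total


# Upper semicircle |z| = 2 traversed from z = 2 to z = -2.
arc = lambda t: 2 * cmath.exp(1j * math.pi * t)
f = lambda z: 1 / (z * z)

value = line_integral(f, arc)
M = 0.25              # |1/z^2| = 1/4 everywhere on the arc
L = 2 * math.pi       # half the circumference of a circle of radius 2
print(abs(value), M * L)
```

Here the integral itself equals 1 (from the antiderivative -1/z), comfortably below the bound ML = π/2; the inequality gives an estimate, not the value.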

Complex line integrals can readily be expressed in terms of
real integrals. For the sum (1) can be written

        \sum_{k=1}^{n} [u(\xi_k,\eta_k) + i v(\xi_k,\eta_k)] (\Delta x_k + i \, \Delta y_k)
            = \sum_{k=1}^{n} [u(\xi_k,\eta_k) \, \Delta x_k - v(\xi_k,\eta_k) \, \Delta y_k]
              + i \sum_{k=1}^{n} [v(\xi_k,\eta_k) \, \Delta x_k + u(\xi_k,\eta_k) \, \Delta y_k]

and, in the limit, the last expression yields the relation

(6)     \int_C f(z) \, dz = \int_C u \, dx - v \, dy + i \int_C v \, dx + u \, dy
                          = \int_C (u + iv)(dx + i \, dy)

From (6) and the known properties of real line integrals (Sec.
12.4) or directly from the definition (2), it is easy to see that,
when the same path of integration is used in each integral, we
have

(7)     \int_A^B f(z) \, dz = -\int_B^A f(z) \, dz

(8)     \int_A^B k f(z) \, dz = k \int_A^B f(z) \, dz

(9)     \int_A^B [f(z) \pm g(z)] \, dz = \int_A^B f(z) \, dz \pm \int_A^B g(z) \, dz

and, if P is a third point on the arc AB,

(10)    \int_A^B f(z) \, dz = \int_A^P f(z) \, dz + \int_P^B f(z) \, dz




EXAMPLE 1 

If C is a circle of radius r and center z_0, and if n is an integer, what is the value of

        \int_C \frac{dz}{(z - z_0)^{n+1}}

For convenience, let us make the substitution z - z_0 = re^{iθ}, noting that θ ranges from 0 to
2π as z ranges around the circle C (Fig. 14.9). Then dz = rie^{iθ} dθ, and the integral becomes

        \int_0^{2\pi} \frac{r i e^{i\theta} \, d\theta}{r^{n+1} e^{i(n+1)\theta}} = \frac{i}{r^n} \int_0^{2\pi} e^{-in\theta} \, d\theta

FIGURE 14.9 The circle z - z_0 = re^{iθ}.

If n = 0, this reduces to

        i \int_0^{2\pi} d\theta = 2\pi i

On the other hand, if n ≠ 0, we have

        \frac{i}{r^n} \int_0^{2\pi} (\cos n\theta - i \sin n\theta) \, d\theta = 0

This is an important result to which we shall have occasion to refer from time to time. 
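
The result of this example lends itself to a direct numerical check. In the Python sketch below, `circle_integral` is a hypothetical helper that applies the same substitution z = z_0 + re^{iθ} and sums over equally spaced values of θ; the center, radius, and range of n are arbitrary test values.

```python
import cmath
import math


def circle_integral(f, z0, r, steps=4000):
    # Integrate f once counterclockwise around |z - z0| = r, using
    # z = z0 + r e^{i theta} and dz = i r e^{i theta} d(theta).
    total = 0j
    d_theta = 2 * math.pi / steps
    for k in range(steps):
        theta = (k + 0.5) * d_theta
        offset = r * cmath.exp(1j * theta)
        total += f(z0 + offset) * 1j * offset * d_theta
    return total


z0, r = 1 - 2j, 0.75
results = {
    n: circle_integral(lambda z, n=n: (z - z0) ** -(n + 1), z0, r)
    for n in (-2, -1, 0, 1, 2)
}
for n, value in results.items():
    print(n, value)
```

Only n = 0 gives a nonzero value, namely 2πi, independent of both the radius and the center.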


The form of the real line integrals in (6) suggests that Green's 
lemma (Theorem 1, Sec. 12.4) and the related results in Theorems 
5 and 6, Sec. 12.5, may be useful in studying line integration in 
the complex plane, and this is indeed the case. Hence, for ease of 
reference, we repeat this important material, appropriately 
specialized to the two-dimensional applications we now have in 
mind:* 


THEOREM 1 

If R is a region, either simply or multiply connected, whose boundary C is sectionally
smooth and if P(x,y), Q(x,y), ∂P/∂y, and ∂Q/∂x are continuous in and on the
boundary of R, then

        \int_C P \, dx + Q \, dy = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx \, dy

where the integration is taken around C in the positive direction with respect to
the interior of R.

* To avoid confusion with u and v in the standard notation for a function of
a complex variable, namely, f(z) = u + iv, we here use P and Q in place of
the symbols U and V we used in Chap. 12.

THEOREM 2 

In any region where ∫ P(x,y) dx + Q(x,y) dy is independent of the path, the
partial derivatives of the function

        \phi(x,y) = \int_{(a,b)}^{(x,y)} P(x,y) \, dx + Q(x,y) \, dy

are     \frac{\partial \phi}{\partial x} = P(x,y)    and    \frac{\partial \phi}{\partial y} = Q(x,y)

THEOREM 3

If ∂P/∂y = ∂Q/∂x at all points of a simply connected region R, then in R the integral

        \int P(x,y) \, dx + Q(x,y) \, dy

is independent of the path, and conversely.

As a first application of Green’s lemma, we have Cauchy’s 
theorem, perhaps the most fundamental and far-reaching result 
in the theory of analytic functions : 


THEOREM 4

If R is a region, either simply or multiply connected, whose boundary C is sectionally
smooth and if f(z) is analytic and f'(z) is continuous within and on the
boundary of R, then

        \int_C f(z) \, dz = 0

PROOF We begin by recalling from Eq. (6) that

        \int_C f(z) \, dz = \int_C u \, dx - v \, dy + i \int_C v \, dx + u \, dy

Now the hypothesis that f'(z) is continuous means that the partial derivatives
∂u/∂x, ∂u/∂y, ∂v/∂x, ∂v/∂y exist and are continuous throughout R. Hence, Green's lemma can
be applied to each of the line integrals on the right of the last expression, giving

        \int_C f(z) \, dz = \iint_R \left( -\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} \right) dx \, dy + i \iint_R \left( \frac{\partial u}{\partial x} - \frac{\partial v}{\partial y} \right) dx \, dy

However, u and v necessarily satisfy the Cauchy-Riemann equations, since, by
hypothesis, f(z) is analytic. Therefore, the integrand of each of the double integrals
vanishes identically in R, leaving

        \int_C f(z) \, dz = 0

as asserted.
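
A numerical experiment makes the role of analyticity in this theorem concrete. The Python sketch below is only an illustration: the helper `circle_integral` and the two test integrands (e^z, which is analytic everywhere, and the conjugate of z, which is analytic nowhere) are choices made here, not taken from the text.

```python
import cmath
import math


def circle_integral(f, center, radius=1.0, steps=4000):
    # Counterclockwise integral of f around |z - center| = radius.
    total = 0j
    d_theta = 2 * math.pi / steps
    for k in range(steps):
        offset = radius * cmath.exp(1j * (k + 0.5) * d_theta)
        total += f(center + offset) * 1j * offset * d_theta
    return total


analytic_case = circle_integral(cmath.exp, 0j)
nonanalytic_case = circle_integral(lambda z: z.conjugate(), 0j)
print(analytic_case, nonanalytic_case)
```

The first integral vanishes to rounding error, as the theorem requires. The second equals 2πi, since on the unit circle the conjugate of z is e^{-iθ} and the integrand reduces to i dθ; the theorem does not apply because the Cauchy-Riemann equations fail.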



FIGURE 14.10 Contours which can be continuously deformed into each other.


The last theorem can be proved without making use of the
hypothesis that f'(z) is continuous.* The French mathematician
Edouard Goursat (1858-1936) was the first to do this, and in his
honor the more general form of the result is usually referred to as
the Cauchy-Goursat theorem.




In particular, if f(z) is analytic in and on the boundary of the
region R between two simple closed curves, we have, from the
Cauchy-Goursat theorem,

        \int_{C_1} f(z) \, dz + \int_{C_2} f(z) \, dz = 0

provided that each curve is traversed in the positive direction, as
shown in Fig. 14.10a. On the other hand, if we reverse the direction
of integration around the inner curve C_2 and transpose the
resultant integral, we obtain

        \int_{C_1} f(z) \, dz = \int_{C_2} f(z) \, dz

each integration now being performed in the counterclockwise
sense, as shown in Fig. 14.10b. Since there may be points in the
interior of C_2 (which, of course, is not a part of R) where f(z) is
not analytic, we cannot assert that either of these integrals is
zero. However, we have shown that they both have the same
value. This result can be summarized in the highly important
principle of the deformation of contours:


THEOREM 5 

The line integral of an analytic function around any closed curve C_1 is equal to the
line integral of the same function around any other closed curve C_2 into which
C_1 can be continuously deformed without passing through a point where f(z) is
nonanalytic.

* See, for example, E. G. Phillips, "Functions of a Complex Variable," pp.
89-92, Interscience Publishers, Inc., New York, 1945.

If f(z) is analytic throughout a simply connected region R,
then, according to the Cauchy-Goursat theorem,

        \int_C f(z) \, dz = 0

for every simple closed curve C in R. But, as we saw in the discussion
which led to Theorem 6, Sec. 12.5, this implies that the
line integral of f(z) between any two points A and B in R is
independent of the path. On the other hand, in multiply connected
regions this observation is not necessarily true, since two different
paths joining A and B might form a closed path encircling one of
the inner boundaries of R and there is no assurance that the
integral of f(z) around such a path is zero. Thus, summarizing,
we have the following theorem:


THEOREM 6 

In any simply connected region where f(z) is analytic, the integral ∫ f(z) dz is
independent of the path.

Using Theorems 2 and 3 we can establish the following inter- 
esting result : 


THEOREM 7 

If u(x,y) is a solution of Laplace's equation in a region R, then in R there exists
an analytic function having u as its real part, namely, f(z) = u + iv, where

        v(x,y) = \int_{(a,b)}^{(x,y)} -\frac{\partial u}{\partial y} \, dx + \frac{\partial u}{\partial x} \, dy

and the path of integration from (a,b) to (x,y) lies entirely in R.

PROOF Suppose first that R is simply connected. Then in R the integral
defining v is independent of the path between the arbitrary fixed point (a,b) and
the variable point (x,y), since the condition for independence provided by Theorem
3 is in this case

        \frac{\partial}{\partial y} \left( -\frac{\partial u}{\partial y} \right) = \frac{\partial}{\partial x} \left( \frac{\partial u}{\partial x} \right)

which is true because of the hypothesis that u satisfies Laplace's equation. Theorem
2 can, therefore, be applied to the integral which defines v, and we have

        \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}    and    \frac{\partial v}{\partial y} = \frac{\partial u}{\partial x}

These are precisely the Cauchy-Riemann equations, which, if the derivatives are
continuous, are the conditions that f(z) = u + iv be an analytic function. But
∂u/∂x and ∂u/∂y, and hence ∂v/∂y and -∂v/∂x, to which these are respectively equal, must
be continuous, since the second partial derivatives ∂²u/∂x² and ∂²u/∂y² are known to exist.

Hence, if R is simply connected, f(z) = u + iv is analytic, as asserted.

On the other hand, if R is multiply connected, then, by the principle of the 
deformation of contours, the possible values of v differ at most by constants 
independent of the end points. And, clearly, a constant added to v will not affect 
the analyticity of u + iv. This completes the proof of the theorem. 
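
The construction in this theorem can be tried out numerically. In the Python sketch below, the harmonic function u = x² - y², the base point (a,b) = (0,0), the two test paths, and the helper name `conjugate_v` are all assumptions made for the illustration; for this u the integral should reproduce v = 2xy whatever path is used.

```python
def conjugate_v(ux, uy, path, n=2000):
    # Approximate v = integral of (-u_y dx + u_x dy) along path(t), 0 <= t <= 1,
    # where ux and uy are the partial derivatives of the harmonic function u.
    total = 0.0
    x0, y0 = path(0.0)
    for k in range(1, n + 1):
        x1, y1 = path(k / n)
        xm, ym = path((k - 0.5) / n)
        total += -uy(xm, ym) * (x1 - x0) + ux(xm, ym) * (y1 - y0)
        x0, y0 = x1, y1
    return total


# u = x^2 - y^2 is harmonic, with u_x = 2x and u_y = -2y.
ux = lambda x, y: 2 * x
uy = lambda x, y: -2 * y

x, y = 1.5, -0.8
straight = lambda t: (t * x, t * y)          # straight line from (0,0) to (x,y)
bent = lambda t: (t * x, t ** 3 * y)         # a curved path with the same ends
v_straight = conjugate_v(ux, uy, straight)
v_bent = conjugate_v(ux, uy, bent)
print(v_straight, v_bent, 2 * x * y)
```

The agreement of the two path integrals illustrates the path independence guaranteed by Theorem 3, and u + iv = x² - y² + 2ixy is indeed the analytic function z².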

One of the most important consequences of Cauchy’s theo- 
rem is what is known as Cauchy’s integral formula : 


THEOREM 8

If f(z) is analytic within and on the boundary C of a simply connected region R
whose boundary C is sectionally smooth and if z_0 is any point in the interior of R,
then

        f(z_0) = \frac{1}{2\pi i} \int_C \frac{f(z)}{z - z_0} \, dz

the integration around C being taken in the positive sense.

PROOF Let C_0 be a circle with center at z_0 and radius ρ small enough so that
C_0 lies entirely in R (Fig. 14.11). Now, by hypothesis, f(z) is analytic everywhere
within R. Hence, the function f(z)/(z - z_0) is analytic everywhere within R
except at the one point z = z_0. In particular, it is analytic everywhere in the region
R' between C and C_0. Hence, by Theorem 5, the integral of this function around
C is equal to its integral around C_0. That is,

(11)    \int_C \frac{f(z)}{z - z_0} \, dz = \int_{C_0} \frac{f(z)}{z - z_0} \, dz = \int_{C_0} \frac{f(z_0) + [f(z) - f(z_0)]}{z - z_0} \, dz
            = f(z_0) \int_{C_0} \frac{dz}{z - z_0} + \int_{C_0} \frac{f(z) - f(z_0)}{z - z_0} \, dz

FIGURE 14.11 The circle C_0 used in the proof of Cauchy's integral formula.

By Example 1, the first integral on the right is equal to 2πi. Hence, the assertion
of the theorem will be established if we can show that the last integral vanishes.
To do this, we observe that

(12)    \left| \int_{C_0} \frac{f(z) - f(z_0)}{z - z_0} \, dz \right| \le \int_{C_0} \frac{|f(z) - f(z_0)|}{|z - z_0|} \, |dz|

On C_0 we have |z - z_0| = ρ. Moreover, since f(z) is analytic and hence continuous,
it follows that, for any ε > 0, there exists a δ such that

        |f(z) - f(z_0)| < \epsilon    provided    |z - z_0| = \rho < \delta

Choosing the radius ρ to be less than δ and inserting these estimates in the right
member of (12), we therefore have

        \left| \int_{C_0} \frac{f(z) - f(z_0)}{z - z_0} \, dz \right| < \int_{C_0} \frac{\epsilon}{\rho} \, |dz| = \frac{\epsilon}{\rho} \int_{C_0} |dz| = \frac{\epsilon}{\rho} \, 2\pi\rho = 2\pi\epsilon

Since the integral on the left is independent of ε, yet cannot exceed 2πε, which can
be made arbitrarily small, it follows that the absolute value of the integral, and
hence the integral itself, is zero. Thus, (11) reduces to

        \int_C \frac{f(z)}{z - z_0} \, dz = f(z_0) \, 2\pi i + 0

or      f(z_0) = \frac{1}{2\pi i} \int_C \frac{f(z)}{z - z_0} \, dz

as asserted. Cauchy's integral formula is also true for multiply connected regions,
but we shall leave as an exercise the easy modification of our proof required to
establish this fact.


EXAMPLE 2

Find the values of

        \int_C \frac{e^{z}}{z^2 + 1} \, dz

if C is a circle of unit radius with center at (a) z = i and (b) z = -i.

In (a) we think of the integral as written in the form

        \int_C \frac{e^{z}/(z + i)}{z - i} \, dz

and identify z_0 as i and f(z) as e^z/(z + i). The function f(z) is analytic everywhere within and on
the given circle of unit radius around z = i. (In fact, it is analytic everywhere except at z = -i.)
Therefore, we can apply Cauchy's integral formula, getting

        \int_C \frac{e^{z}}{z^2 + 1} \, dz = 2\pi i \, f(i) = 2\pi i \, \frac{e^{i}}{2i} = \pi e^{i} = \pi (\cos 1 + i \sin 1)

In (b) we identify z_0 as -i and f(z) as e^z/(z - i). Then Cauchy's integral formula gives
immediately

        \int_C \frac{e^{z}}{z^2 + 1} \, dz = 2\pi i \, f(-i) = 2\pi i \, \frac{e^{-i}}{-2i} = -\pi e^{-i} = -\pi (\cos 1 - i \sin 1)
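
Both parts of this example can be confirmed by brute-force integration around the two unit circles. The sketch below assumes, as the discussion of parts (a) and (b) indicates, that the integrand is e^z/(z² + 1); the helper name `circle_integral` is hypothetical.

```python
import cmath
import math


def circle_integral(f, center, radius=1.0, steps=4000):
    # Counterclockwise integral of f around |z - center| = radius.
    total = 0j
    d_theta = 2 * math.pi / steps
    for k in range(steps):
        offset = radius * cmath.exp(1j * (k + 0.5) * d_theta)
        total += f(center + offset) * 1j * offset * d_theta
    return total


integrand = lambda z: cmath.exp(z) / (z * z + 1)
value_a = circle_integral(integrand, 1j)     # circle centered at z = i
value_b = circle_integral(integrand, -1j)    # circle centered at z = -i

print(value_a, math.pi * cmath.exp(1j))      # compare with  pi e^{i}
print(value_b, -math.pi * cmath.exp(-1j))    # compare with -pi e^{-i}
```

Each circle encloses exactly one of the two points z = ±i where the integrand fails to be analytic, which is why the two answers differ.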


From Cauchy’s integral formula, which expresses the value 
of an analytic function at an interior point of a region R in terms 
of its values on the boundary of the region, we can readily obtain 
an expression for the derivative of a function at an interior point 
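of R in terms of the boundary values of the function. In fact we
have

        f'(z_0) = \lim_{\Delta z_0 \to 0} \frac{f(z_0 + \Delta z_0) - f(z_0)}{\Delta z_0}
                = \lim_{\Delta z_0 \to 0} \frac{1}{\Delta z_0} \left[ \frac{1}{2\pi i} \int_C \frac{f(z) \, dz}{z - (z_0 + \Delta z_0)} - \frac{1}{2\pi i} \int_C \frac{f(z) \, dz}{z - z_0} \right]
                = \lim_{\Delta z_0 \to 0} \frac{1}{\Delta z_0} \left[ \frac{1}{2\pi i} \int_C f(z) \left( \frac{1}{z - (z_0 + \Delta z_0)} - \frac{1}{z - z_0} \right) dz \right]
                = \lim_{\Delta z_0 \to 0} \frac{1}{2\pi i} \int_C \frac{f(z) \, dz}{(z - z_0 - \Delta z_0)(z - z_0)}

Taking for granted that the limit of the integral is equal to the
integral of the limit in the last expression and letting Δz_0 → 0 in
the integrand, we obtain the desired result:

        f'(z_0) = \frac{1}{2\pi i} \int_C \frac{f(z) \, dz}{(z - z_0)^2}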


That the limiting procedure is legitimate in this case can easily
be established by showing that the absolute value of the difference

(13)    \frac{1}{2\pi i} \int_C \frac{f(z) \, dz}{(z - z_0 - \Delta z_0)(z - z_0)} - \frac{1}{2\pi i} \int_C \frac{f(z) \, dz}{(z - z_0)^2}

approaches zero as Δz_0 → 0.

Continuing in the same way, we obtain the additional
formulas

        f''(z_0) = \frac{2!}{2\pi i} \int_C \frac{f(z) \, dz}{(z - z_0)^3}

        f'''(z_0) = \frac{3!}{2\pi i} \int_C \frac{f(z) \, dz}{(z - z_0)^4}

These results could all have been obtained formally by repeated
differentiation of Cauchy's integral formula with respect to the
parameter z_0.

From the preceding discussion we conclude not only that an 
analytic function possesses derivatives of all orders but also that 
each derivative is itself analytic, since it, too, possesses a deriva- 
tive. This completes the proof of the following theorem: 


THEOREM 9 


If f(z) is analytic throughout a closed, simply connected region R, then, at any
interior point z_0 of R, the derivatives of f(z) of all orders exist and are analytic.
Moreover,

        f^{(n)}(z_0) = \frac{n!}{2\pi i} \int_C \frac{f(z) \, dz}{(z - z_0)^{n+1}}

where C is the boundary of R.
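
As an informal check of this formula, the Python sketch below evaluates the contour integral numerically for f(z) = e^z, whose derivatives of every order are again e^z. The helper name `derivative_by_cauchy`, the choice of a circular contour of radius 1 about z_0, and the test point are assumptions made only for the illustration.

```python
import cmath
import math


def derivative_by_cauchy(f, z0, n, radius=1.0, steps=2000):
    # f^(n)(z0) = n!/(2 pi i) * integral of f(z)/(z - z0)^(n+1) dz,
    # taken counterclockwise around the circle |z - z0| = radius.
    total = 0j
    d_theta = 2 * math.pi / steps
    for k in range(steps):
        offset = radius * cmath.exp(1j * (k + 0.5) * d_theta)
        total += f(z0 + offset) / offset ** (n + 1) * 1j * offset * d_theta
    return math.factorial(n) / (2j * math.pi) * total


z0 = 0.3 + 0.2j
third = derivative_by_cauchy(cmath.exp, z0, 3)
print(third, cmath.exp(z0))
```

Note that the derivative at z_0 is obtained here purely from values of f on the surrounding circle, which is the content of the theorem.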


It is interesting to note that functions of a real variable do 
not in general possess the derivative properties described by 
Theorem 9, for, at particular points, a function of a real variable 
may possess one or more derivatives without the derivatives of 
all orders existing. For instance, at the origin the function x^{7/3}
possesses a first and a second derivative but no derivatives of
higher order.

Using Theorem 9, we can now prove the converse of Cauchy’s 
theorem, which is known as Morera’s theorem :* 


THEOREM 10 

If f(z) is continuous in a region R and if ∫_C f(z) dz = 0 for every simple closed
curve which can be drawn in R, then f(z) is analytic in R.

PROOF To prove this, we observe, as in the proof of Theorem 6, Sec. 12.5, that,
if the line integral of f(z) around every closed curve in R is zero, then the line
integral of f(z) between a fixed point z_0 and a variable point z in R is independent
of the path and, hence, is a function of z alone, say

        F(z) = \int_{z_0}^{z} f(z) \, dz

If we let f(z) = u + iv and F(z) = U + iV, this can be written

        F(z) = U + iV = \int_{(x_0,y_0)}^{(x,y)} u \, dx - v \, dy + i \int_{(x_0,y_0)}^{(x,y)} v \, dx + u \, dy

or, equating real and imaginary parts,

        U = \int_{(x_0,y_0)}^{(x,y)} u \, dx - v \, dy    and    V = \int_{(x_0,y_0)}^{(x,y)} v \, dx + u \, dy

By Theorem 2, each of these integrals can be differentiated partially with respect
to x and y, and we find

        \frac{\partial U}{\partial x} = u    \frac{\partial U}{\partial y} = -v    \frac{\partial V}{\partial x} = v    \frac{\partial V}{\partial y} = u

From these it is obvious that

        \frac{\partial U}{\partial x} = \frac{\partial V}{\partial y}    and    \frac{\partial U}{\partial y} = -\frac{\partial V}{\partial x}

or, in other words, that U and V satisfy the Cauchy-Riemann equations. Moreover,
since u and v are continuous, because of the hypothesis that f(z) = u + iv
is continuous, it follows that ∂U/∂x, ∂U/∂y, ∂V/∂x, ∂V/∂y are continuous. Hence, F(z) = U + iV
is an analytic function whose derivative, in fact, is

        F'(z) = \frac{\partial U}{\partial x} + i \frac{\partial V}{\partial x} = u + iv = f(z)

Being the derivative of an analytic function, f(z) is therefore analytic, by Theorem
9, as asserted.

* Named for the Italian mathematician Giacinto Morera (1856-1909).

Beginning with the formula for / (n) (z 0 ) provided by Theorem 
9, we can now establish what is known as Cauchy’s inequality : 

THEOREM 11 

If f(z) is analytic within and on a circle of radius r with center at z_0, then

        |f^{(n)}(z_0)| \le \frac{n! \, M}{r^n}

where M is the maximum value of |f(z)| on C.

PROOF From Theorem 9, we have

        |f^{(n)}(z_0)| = \left| \frac{n!}{2\pi i} \int_C \frac{f(z) \, dz}{(z - z_0)^{n+1}} \right|
                       \le \frac{n!}{2\pi} \int_C \frac{|f(z)| \, |dz|}{|z - z_0|^{n+1}}
                       \le \frac{n!}{2\pi} \, \frac{M}{r^{n+1}} \int_C |dz|
                       = \frac{n!}{2\pi} \, \frac{M}{r^{n+1}} \, 2\pi r
                       = \frac{n! \, M}{r^n}

as asserted.


For the special case n = 0, Cauchy's inequality becomes

        |f(z_0)| \le M

which shows that, on every circle around z_0, no matter how small,
|f(z)| has a maximum value M which is at least as great as |f(z_0)|.
In other words, we have the following result, usually referred to
as the maximum modulus theorem:

THEOREM 12 

The absolute value of a function f(z) cannot have a maximum at any point where 
the function is analytic. 

EXERCISES 

1 Evaluate \int_0^{3+i} z^2 \, dz, (a) along the line y = x/3, (b) along the real axis to 3 and then vertically
to 3 + i, and (c) along the imaginary axis to i and then horizontally to 3 + i.

2 Evaluate \int_0^{3+i} (\bar z)^2 \, dz along each of the paths used in Exercise 1.

3 Evaluate \int_0^{1+i} (x^2 + iy) \, dz along the paths y = x and y = x².

fl+i 

4 Obtain an upper bound for the absolute value of the integral / e~ sl dz, (a) along y = x, 
(b) along y = x 2 , and (c) along the real axis to 1 and then vertically to 1 + i. 

5 Obtain an upper bound for the absolute value of the integral ~ J ■■ dz taken around 

the circle [ 3 } = 3. What is the value of this integral if the path of integration is the circle 

1*1 = 3i? 

6 What is the value of ^ C is the circle I* + 1| =»• 1? (b) if C is the 

ellipse x 2 + 2 y % = 8? .(c) if C is the circle \z + i\ = 1? 

7 What is the value of [ * ■ dz (a) if C is the circle |z| = 1? (b) if C is the circle 

jc z z "t 2z + 5 

I* + 1 — i[ = 2? (c) if C is the circle jz -f- 1 -f i| = 2? 

8 What is the value of / - — ~~~ dz around the circle \z — 1| =3? 

J (2 + l) 2 

9 What is the value of J - 2 £j~— dz, (a) around the circle |z| = 1? (b) around the circle 

\z — 2 — t'! —2? and (c) around the circle |z — 1 — 2i| = 2? 

10 Show that Cauchy’s integral formula is valid in multiply connected regions. 

11 If u(x,y) is harmonic, i.e., satisfies Laplace’s equation, within the closed region bounded by 
a circle C, prove that the maximum value of u(x,y) in R always occurs on C and not in the 
interior of R. [Hint: Apply the maximum modulus theorem to the function e^{f(z)}, where f(z)
is the analytic function having u(x,y) as its real part.]

12 Using Theorem 9, show that 

I" 1 f e** 

where C is any simple closed curve encircling the origin. 

13 Observing that the result of Exercise 12 can be written 

( 


wl) 2tri jc nlz n+1 dZ 





prove that 



14 Complete the proof of Theorem 9 by showing that the absolute value of the difference (13) 
approaches zero as Az 0 approaches zero. 

15 a Taking C to be the circle defined by z = Re^{iθ} and letting z_0 = re^{iφ} (r < R), show that
Cauchy’s integral formula becomes 


/(re*'*) > 


-1 f 

2t rt JC 


f(Re i9 ) 


b Show also that 


A f 

2tt i JC 


f(Re i9 ) 
R& 9 e* 


c Finally, by subtracting these two integrals and equating real parts in the resulting 
equation, obtain Poisson’s formula: 

        u(r,\phi) = \frac{1}{2\pi} \int_0^{2\pi} \frac{(R^2 - r^2) \, u(R,\theta)}{R^2 - 2Rr \cos (\theta - \phi) + r^2} \, d\theta



CHAPTER FIFTEEN 


Infinite Series in the Complex Plane

15.1 

Series of complex terms 

Most of the definitions and theorems relating to infinite series of 
real terms can be applied with little or no change to series whose 
terms are complex. To restate these briefly, let 
(1)     f_1(z) + f_2(z) + f_3(z) + \cdots + f_n(z) + \cdots

be a series whose terms are functions of the complex variable z.
Then the partial sums of this series are defined as the finite sums

        S_1(z) = f_1(z)
        S_2(z) = f_1(z) + f_2(z)
        \cdots
        S_n(z) = f_1(z) + f_2(z) + \cdots + f_n(z)

The series (1) is said to converge to the sum S(z) in a region R 
provided that, for all values of z in R, the limit of the nth partial 
sum S n (z) as n becomes infinite is S(z). 

According to the technical definition of a limit, this requires
that, for any ε > 0, there should exist an integer N, depending in
general on ε and on the particular value of z under consideration,
such that

        |S(z) - S_n(z)| < \epsilon    for all n > N

The difference S(z) - S_n(z) is evidently just the remainder after
n terms R_n(z); thus, the definition of convergence requires that
the limit of |R_n(z)|, as n becomes infinite, should be zero. A series
which has a sum, as just defined, is said to be convergent, and the
set of all values of z for which it converges is called the region of
convergence of the series. A series which is not convergent is said
to be divergent. If the absolute values of the terms in (1) form a
convergent series

        |f_1(z)| + |f_2(z)| + |f_3(z)| + \cdots + |f_n(z)| + \cdots




then (1) is said to be absolutely convergent. If the series (1) con- 
verges but is not absolutely convergent, it is said to be condi- 
tionally convergent. Absolute convergence is an important 
property because it is a sufficient though not necessary condition 
for ordinary convergence. Moreover, the terms of an absolutely 
convergent series can be rearranged in any manner whatsoever 
without affecting the sum of the series, whereas rearranging the 
terms of a conditionally convergent series may alter the sum of 
the series or even cause the series to diverge. From the definition 
of convergence it is easy to prove the following theorem: 

THEOREM 1 

A necessary and sufficient condition that the series of complex terms 

        f_1(z) + f_2(z) + f_3(z) + \cdots + f_n(z) + \cdots

converge is that the series of the real parts and the series of the imaginary parts of
these terms each converge. Moreover, if

        \sum_{n=1}^{\infty} \Re(f_n)    and    \sum_{n=1}^{\infty} \Im(f_n)

converge to the respective functions R(z) and I(z), then the given series converges
to R(z) + iI(z).

Of all the tests for the convergence of infinite series, the most 
useful is probably the familiar ratio test, which applies to series 
whose terms are complex as well as to series whose terms are real : 


THEOREM 2 

For the series 

        f_1(z) + f_2(z) + f_3(z) + \cdots + f_n(z) + \cdots

let     \lim_{n \to \infty} \left| \frac{f_{n+1}(z)}{f_n(z)} \right| = |r(z)|

Then the given series converges absolutely for those values of z for which 0 ≤
|r(z)| < 1 and diverges for those values of z for which |r(z)| > 1. The values of z
for which |r(z)| = 1 form the boundary of the region of convergence of the series,
and at these points the ratio test provides no information about the convergence
or divergence of the series.


EXAMPLE 1 

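Find the region of convergence of the series

        \sum_{n=1}^{\infty} \frac{1}{n^2} \left( \frac{z + 1}{z - 1} \right)^n = \frac{z + 1}{z - 1} + \frac{1}{2^2} \left( \frac{z + 1}{z - 1} \right)^2 + \frac{1}{3^2} \left( \frac{z + 1}{z - 1} \right)^3 + \cdots

Applying the ratio test, we find

        \left| \frac{f_{n+1}(z)}{f_n(z)} \right| = \frac{n^2}{(n + 1)^2} \left| \frac{z + 1}{z - 1} \right|

and thus

        |r(z)| = \lim_{n \to \infty} \frac{n^2}{(n + 1)^2} \left| \frac{z + 1}{z - 1} \right| = \left| \frac{z + 1}{z - 1} \right|

Hence, the values of z for which the series
surely converges are those in the region defined by the inequality

        \left| \frac{z + 1}{z - 1} \right| < 1    that is, by    |z + 1| < |z - 1|

Now |z + 1| is just the distance from z to the point -1, and |z - 1| is just the distance from z to
the point 1. Hence, z is restricted to be nearer to the point -1 than to the point 1. In other
words, z must lie in the left half of the complex plane. The boundary cases for which the test fails
are the values of z which are equidistant from -1 and 1, that is, the values of z on the imaginary
axis. But, for these points, the related series of absolute values is the convergent real series

        1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots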


Hence, for all values of z on the imaginary axis, the given series, being absolutely convergent, is 
convergent; therefore, these points also belong to the region of convergence. 
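
The three regimes of this example can be explored numerically. The sketch below rests on an assumption: it takes the series to be Σ (1/n²)((z + 1)/(z - 1))^n, which is what the ratio and the boundary series quoted in the example suggest, since the printed series itself is not legible in this copy. The helper names `ratio`, `term`, and `partial_sum` and the three test points are illustrative.

```python
def ratio(z):
    # Common quantity (z + 1)/(z - 1) whose modulus decides convergence.
    return (z + 1) / (z - 1)


def term(z, n):
    # n-th term of the assumed series: (1/n^2) * ((z + 1)/(z - 1))^n
    return ratio(z) ** n / n ** 2


def partial_sum(z, terms):
    return sum(term(z, n) for n in range(1, terms + 1))


left = -2 + 1j       # left half plane: |ratio| < 1
boundary = 2j        # imaginary axis:  |ratio| = 1
right = 2 + 1j       # right half plane: |ratio| > 1

print(abs(ratio(left)), abs(ratio(boundary)), abs(ratio(right)))
print(partial_sum(left, 100), partial_sum(left, 200))
print(partial_sum(boundary, 200), partial_sum(boundary, 400))
print(abs(term(right, 50)))
```

In the left half plane the partial sums settle almost immediately; on the imaginary axis they still converge, though only as fast as Σ 1/n²; in the right half plane the individual terms grow without bound, so the series cannot converge.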


The sum or difference of two convergent series can be found 
by term-by-term addition or subtraction of the series. If two 
series converge absolutely, their product can be found by multi- 
plying the series together as though they were polynomials. To 
establish conditions under which series can legitimately be inte- 
grated or differentiated term by term, however, the concept of 
uniform convergence is required: 


DEFINITION 1 

A series of functions is said to converge uniformly to the function S(z) in a region
R, either open or closed, if corresponding to an arbitrary $\epsilon > 0$ there exists a
positive integer N, depending on $\epsilon$ but not on z, such that for every value of z in R

$$ |S(z) - S_n(z)| < \epsilon \qquad \text{for all } n > N $$

In other words, if a series converges uniformly in a region R, then,
corresponding to any $\epsilon > 0$, there exists an integer N such that
everywhere in R the sum of the series S(z) can be approximated
with an error less than $\epsilon$ by using no more than N terms of the
series. It may well be that fewer than N terms will suffice at 
most of the points of the region, but nowhere will more than N 
be required. This is in sharp contrast to ordinary convergence; 
for, in the neighborhood of certain points in a region of ordinary 
convergence, it may be that no limit can be set on the number 
of terms required to secure a prescribed degree of accuracy. 


EXAMPLE 2 

Discuss the convergence of the series

$$ z^2 + \frac{z^2}{1+z^2} + \frac{z^2}{(1+z^2)^2} + \frac{z^2}{(1+z^2)^3} + \cdots $$

in the 90° sector bounded by the right halves of the lines y = ±x (Fig. 15.1a).



The given series is a geometric progression which will converge for all values of z for which
the absolute value of the common ratio, i.e.,

$$ |r| = \left| \frac{1}{1+z^2} \right| $$

is less than 1. Now the angle of z is restricted, by hypothesis, to be between $-\pi/4$ and $\pi/4$; hence,
the angle of $z^2$ must be between $-\pi/2$ and $\pi/2$. Therefore (Fig. 15.1b), for every z in the given
region R, we have

$$ |1 + z^2| \ge 1 \qquad\text{and}\qquad \frac{1}{|1+z^2|} \le 1 $$

and the equality signs hold only for the value z = 0. Thus, the given series converges for all
values of z in R, and its sum is

$$ S(z) = \begin{cases} \dfrac{z^2}{1 - 1/(1+z^2)} = 1 + z^2 & z \ne 0 \\[4pt] 0 + 0 + 0 + 0 + \cdots = 0 & z = 0 \end{cases} $$

Now let an arbitrary $\epsilon > 0$ be given, and let us attempt to determine how many terms of the
series must be taken in order that

$$ |S(z) - S_n(z)| < \epsilon $$

This difference, i.e., the remainder after n terms of the series, is just the geometric progression

$$ R_n(z) = \frac{z^2}{(1+z^2)^n} + \frac{z^2}{(1+z^2)^{n+1}} + \cdots = \begin{cases} \dfrac{1}{(1+z^2)^{n-1}} & z \ne 0 \\[4pt] 0 & z = 0 \end{cases} $$

Hence, our task is to find, if possible, a value of N such that

$$ |R_n(z)| = \frac{1}{|1+z^2|^{n-1}} < \epsilon \qquad \text{for all } n > N \text{ and all } z \text{ in } R \tag{2} $$

Now $|1 + z^2| \le 1 + |z^2| = 1 + |z|^2$. Hence, overestimating the denominator of $|R_n(z)|$, we have

$$ |R_n(z)| = \frac{1}{|1+z^2|^{n-1}} \ge \frac{1}{(1+|z|^2)^{n-1}} $$

From this inequality we observe that if it should be impossible to find an integer N such that

$$ \frac{1}{(1+|z|^2)^{n-1}} < \epsilon \qquad \text{for all } n > N \text{ and all } z \text{ in } R \tag{3} $$

then surely it will be impossible to find an integer N which will suffice to keep

$$ |R_n(z)| < \epsilon $$

everywhere in R. And this is indeed the case, for if we attempt to solve the inequality (3) for
n, by obvious steps we find

$$ (n-1)\ln(1+|z|^2) > \ln\frac{1}{\epsilon} = -\ln\epsilon \qquad\text{or}\qquad n > 1 - \frac{\ln\epsilon}{\ln(1+|z|^2)} $$

But, for values of z within the sector of the problem and sufficiently close to the origin, $\ln(1+|z|^2)$
can be made arbitrarily close to ln 1, that is, zero. Hence, n is unbounded, and there exists no
integer N for which (3) holds. Since $|R_n(z)|$ is at least as large as the fraction in (3), it is clear that the
fundamental requirement of uniform convergence (2) cannot be fulfilled. Hence, the convergence
of the given series in the original region is nonuniform.

On the other hand, if we restrict z to the infinite region R', bounded by the given rays and a
circular arc of small but fixed radius $\delta$, as shown in Fig. 15.1c, the series converges uniformly. In
fact, the law of cosines applied to Fig. 15.1b gives

$$ |1+z^2|^2 = 1 + |z|^4 + 2|z|^2\cos(\arg z^2) $$

or, reducing the right-hand side by dropping the last term, which is surely nonnegative, since
$-\pi/2 \le \arg z^2 \le \pi/2$,

$$ |1+z^2|^2 \ge 1 + |z|^4 $$

Hence, underestimating the denominator of $|R_n(z)|$, we can write

$$ |R_n(z)| = \frac{1}{|1+z^2|^{n-1}} \le \frac{1}{(1+|z|^4)^{(n-1)/2}} $$

From this it is clear that, if we can find an integer N such that

$$ \frac{1}{(1+|z|^4)^{(n-1)/2}} < \epsilon \qquad \text{for all } n > N \text{ and all } z \text{ in } R' \tag{4} $$

then surely for the same N we shall have

$$ |R_n(z)| < \epsilon \qquad \text{for all } n > N \text{ and all } z \text{ in } R' \tag{5} $$

We now attempt to solve the inequality in (4) for n:

$$ \frac{n-1}{2}\ln(1+|z|^4) > \ln\frac{1}{\epsilon} = -\ln\epsilon \qquad\text{or}\qquad n > 1 - \frac{2\ln\epsilon}{\ln(1+|z|^4)} $$

The most unfavorable case, i.e., the largest possible value of the fraction on the right, occurs
when |z| is as small as possible. But, in the modified region we are now considering, the smallest
possible value of |z| is $\delta$, which yields

$$ n > 1 - \frac{2\ln\epsilon}{\ln(1+\delta^4)} $$

If we choose N to be the first integer equal to or greater than the expression on the right, then
(4) will surely hold. But, as we observed above, if (4) is satisfied, so too is (5), and, hence, in the
modified region R' the given series converges uniformly.
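
The contrast between the two regions of Example 2 can be made concrete with a short computation. The following is a minimal sketch, assuming the remainder formula $|R_n(z)| = 1/|1+z^2|^{n-1}$ for $z \ne 0$; the helper name `terms_needed` is hypothetical and introduced only for this illustration.

```python
import math


def terms_needed(z: complex, eps: float) -> int:
    """Smallest n with 1/|1 + z^2|**(n - 1) < eps, for z != 0 in the sector."""
    base = abs(1 + z * z)                      # exceeds 1 for z != 0 in the sector
    # (n - 1) * ln(base) > -ln(eps)   <=>   n > 1 - ln(eps)/ln(base)
    bound = 1 - math.log(eps) / math.log(base)
    return math.floor(bound) + 1               # first integer strictly above the bound


eps = 1e-3
for r in (1.0, 0.1, 0.01, 0.001):
    print(f"|z| = {r:<6} terms needed: {terms_needed(r, eps)}")
```

As z approaches the origin along the positive real axis, the number of terms required for a fixed accuracy grows without limit, which is the nonuniformity found above. If |z| is kept at or above a fixed positive value, the count stays bounded.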

Usually uniform convergence is established not by a direct 
application of the definition, as in Example 2, but by the so-called 
Weierstrass M test:* 


THEOREM 3 

If a sequence of positive constants $\{M_n\}$ exists such that $|f_n(z)| \le M_n$ for all
positive integers n and for all values of z in a given region R and if the series

$$ M_1 + M_2 + M_3 + \cdots + M_n + \cdots $$

is convergent, then the series

$$ f_1(z) + f_2(z) + f_3(z) + \cdots + f_n(z) + \cdots $$

converges uniformly in R.

PROOF To prove this, we must show that for any $\epsilon > 0$ there exists an integer
N, independent of z, such that for all values of z in R the absolute value of the
remainder after n terms in the series of the f’s is less than $\epsilon$ whenever n exceeds N.
To do this, we note that

$$ |R_n(z)| = |f_{n+1}(z) + f_{n+2}(z) + \cdots| \le |f_{n+1}(z)| + |f_{n+2}(z)| + \cdots \le M_{n+1} + M_{n+2} + \cdots \tag{6} $$

The last expression is just the remainder after n terms of the series of the M’s.
Since this series is convergent, by hypothesis, it follows that, for every $\epsilon > 0$,
there exists an N such that this remainder is less than $\epsilon$ for all n > N. This value
of N, arising as it does from a series of constants, is obviously independent of z.
Moreover, from the inequality (6) it is clear that whenever n exceeds this N,
$|R_n(z)| < \epsilon$ for all values of z in R. Hence, the series of the f’s is uniformly
convergent, as asserted. Incidentally, this theorem implies a comparison test which
proves that the series of the f’s is also absolutely convergent.

The Weierstrass M test is merely a sufficient test; that is,
there exist uniformly convergent series whose terms cannot be
dominated by the respective terms of any convergent series of
positive constants.† The M test suffices for almost all applications,
however.
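
As an illustration of the M test, consider the series with terms $f_n(z) = z^n/n^2$ on the closed disk $|z| \le 1$, for which $M_n = 1/n^2$ serves as a dominating constant. This series is an example chosen here, not one from the text, and the helper names below are hypothetical. The sketch truncates both tails at a large index, so it only approximates the true remainders.

```python
import cmath
import math


def term(n: int, z: complex) -> complex:
    """f_n(z) = z**n / n**2."""
    return z ** n / n ** 2


def M(n: int) -> float:
    """Dominating constant M_n = 1/n**2, valid for |z| <= 1."""
    return 1.0 / n ** 2


def worst_remainder(n: int, samples: int = 24, upto: int = 4000) -> float:
    """Largest sampled |f_{n+1}(z) + ... + f_upto(z)| over points of |z| = 1."""
    worst = 0.0
    for k in range(samples):
        z = cmath.exp(2j * math.pi * k / samples)
        tail = sum(term(m, z) for m in range(n + 1, upto + 1))
        worst = max(worst, abs(tail))
    return worst


def majorant_tail(n: int, upto: int = 4000) -> float:
    """Truncated remainder M_{n+1} + ... + M_upto of the series of constants."""
    return sum(M(m) for m in range(n + 1, upto + 1))


for n in (5, 50):
    print(n, worst_remainder(n), majorant_tail(n))
```

At every sampled point the remainder of the series of functions is no larger than the remainder of the series of constants, and the latter does not depend on z. This is the mechanism used in the proof of Theorem 3.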

One useful property of uniformly convergent series is con- 
tained in the following theorem: 

THEOREM 4 

If the terms of a uniformly convergent series are multiplied by any bounded 
function of z, the resulting series will also converge uniformly. 


* Karl Weierstrass (1815-1897), a German mathematician, is often called 
the “father of modern rigor.” 

† One example of such a series will be found in Exercise 6.





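PROOF Let R be the region of uniform convergence of the series

$$ f_1(z) + f_2(z) + f_3(z) + \cdots + f_n(z) + \cdots $$

and suppose that throughout R the multiplying function g(z) satisfies

$$ |g(z)| \le M $$

Now, since the series of the f’s converges uniformly, it follows that, corresponding
to the positive quantity $\epsilon/M$, there exists an integer N such that

$$ |f_{n+1}(z) + f_{n+2}(z) + \cdots| < \frac{\epsilon}{M} \qquad \text{for all } n > N \text{ and all } z \text{ in } R $$

Hence,

$$ |g(z)f_{n+1}(z) + g(z)f_{n+2}(z) + \cdots| = |g(z)|\,|f_{n+1}(z) + f_{n+2}(z) + \cdots| \le M\,|f_{n+1}(z) + f_{n+2}(z) + \cdots| < M\,\frac{\epsilon}{M} = \epsilon $$

for all n > N and all z in R. But this is precisely the condition that the product series

$$ g(z)f_1(z) + g(z)f_2(z) + g(z)f_3(z) + \cdots + g(z)f_n(z) + \cdots $$

be uniformly convergent.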

One important consequence of uniform convergence is 
embodied in the following theorem: 


THEOREM 5 

The sum of a uniformly convergent series of continuous functions is a continuous 
function. 

PROOF Let

$$ f(z) = f_1(z) + f_2(z) + f_3(z) + \cdots + f_n(z) + \cdots = S_n(z) + R_n(z) $$

be a uniformly convergent series in which each term is a continuous function of z,
and let $\epsilon/3$ be an arbitrarily small positive quantity. Then, since the series converges
uniformly, an integer N exists such that

$$ |R_n(z)| < \frac{\epsilon}{3} \qquad \text{for all } n > N $$

and for all values of z in the region of uniform convergence. In particular, if $\Delta_1 z$
is any increment such that $z + \Delta_1 z$ is still in the region of uniform convergence,
we also have

$$ |R_n(z + \Delta_1 z)| < \frac{\epsilon}{3} \qquad \text{for all } n > N $$

Moreover, since each term of the given series is a continuous function and since
any finite sum of continuous functions is necessarily continuous, it follows that
$S_n(z)$ is continuous and, hence, that there exists an increment $\Delta_2 z$ such that

$$ |S_n(z + \Delta z) - S_n(z)| < \frac{\epsilon}{3} \qquad \text{for all } \Delta z\text{’s for which } |\Delta z| < |\Delta_2 z| $$





Now

$$ |f(z + \Delta z) - f(z)| = |[S_n(z + \Delta z) + R_n(z + \Delta z)] - [S_n(z) + R_n(z)]| \le |S_n(z + \Delta z) - S_n(z)| + |R_n(z + \Delta z)| + |R_n(z)| $$

Hence, for all $\Delta z$’s whose absolute values are less than the smaller of the
quantities $|\Delta_1 z|$ and $|\Delta_2 z|$, it follows that

$$ |f(z + \Delta z) - f(z)| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon $$

which is precisely what we mean by saying that f(z) is continuous.


Theorem 5 makes no assertion about the sum of a series of 
continuous functions if the convergence is nonuniform. However, 
specific examples make it clear that in such cases the sum need 
not be continuous. For instance, Example 2, in which we found
the sum of the series

$$ z^2 + \frac{z^2}{1+z^2} + \frac{z^2}{(1+z^2)^2} + \frac{z^2}{(1+z^2)^3} + \cdots $$

to be

$$ f(z) = \begin{cases} 1 + z^2 & z \ne 0 \\ 0 & z = 0 \end{cases} $$

shows that the limit of a sum of continuous functions may be
discontinuous if the convergence is nonuniform. In fact, in the
neighborhood of z = 0, where the convergence is nonuniform,
the sum jumps abruptly from $1 + z^2$ to 0, even though every
term of the series is a continuous function of z for all values of z
except z = ±i.

One of the most important properties of uniformly conver- 
gent series is given by the following theorem: 


THEOREM 6 

The integral of the sum of a uniformly convergent series of continuous functions 
along any curve C lying entirely in the region of uniform convergence can be 
found by term-by-term integration of the series. Moreover, if each term of the 
series is analytic, so, too, is the sum. 

PROOF Let the given series be

$$ f(z) = f_1(z) + f_2(z) + \cdots + f_n(z) + \cdots $$

Then, to establish the theorem we must show that

$$ \int_C f(z)\,dz = \int_C f_1(z)\,dz + \int_C f_2(z)\,dz + \cdots + \int_C f_n(z)\,dz + \cdots $$

which, in accordance with the usual definition of convergence, requires that we
prove the existence, for every $\epsilon > 0$, of an integer N such that

$$ \left| \int_C f(z)\,dz - \sum_{i=1}^{n} \int_C f_i(z)\,dz \right| < \epsilon \qquad \text{for all } n > N $$

Now for any finite sum it is true that the integral of a sum is equal to the sum of
the integrals. Hence, the left member of the last inequality can be written

$$ \left| \int_C f(z)\,dz - \int_C \sum_{i=1}^{n} f_i(z)\,dz \right| = \left| \int_C \Big[ f(z) - \sum_{i=1}^{n} f_i(z) \Big]\,dz \right| = \left| \int_C R_n(z)\,dz \right| $$

Let L be the length of the path of integration. Then, from the uniform
convergence of the given series, we know that there exists an integer N such that

$$ |R_n(z)| < \frac{\epsilon}{L} \qquad \text{for all } n > N $$

and for all z’s in the region of uniform convergence, in particular for all values
of z on the path of integration C. If n > N, we can therefore write

$$ \left| \int_C f(z)\,dz - \sum_{i=1}^{n} \int_C f_i(z)\,dz \right| = \left| \int_C R_n(z)\,dz \right| \le \int_C |R_n(z)|\,|dz| < \frac{\epsilon}{L} \int_C |dz| = \frac{\epsilon}{L}\,L = \epsilon $$

which establishes the first part of the theorem.

To establish the second part, we suppose that the region of uniform con- 
vergence R is either simply connected or has been made simply connected by 
suitable cross cuts. Then, if each term $f_i$ is analytic in R, it follows from Cauchy’s
theorem that the integral of each term around any simple closed curve in R (or 
its simply connected modification) is zero. Hence, the integral of the sum f(z) 
around any closed curve is zero, and, thus, by Morera’s theorem, f(z) is analytic. 
This completes the proof of the theorem. 
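
Term-by-term integration can be illustrated with the geometric series $1 + z + z^2 + \cdots = 1/(1-z)$, which converges uniformly on $|z| \le r$ for any fixed $r < 1$. This example is chosen here for illustration and is not taken from the text; the function names are hypothetical. Integrating along the straight segment from 0 to $z_0$, the integral of the sum is $-\mathrm{Log}(1 - z_0)$, while the termwise integrals are $z_0^{\,n+1}/(n+1)$.

```python
import cmath


def termwise_integral(z0: complex, terms: int = 200) -> complex:
    """Sum of the integrals of z**n from 0 to z0, i.e. z0**(n + 1)/(n + 1)."""
    return sum(z0 ** (n + 1) / (n + 1) for n in range(terms))


def integral_of_sum(z0: complex) -> complex:
    """Integral of 1/(1 - z) from 0 to z0 inside |z| < 1, i.e. -Log(1 - z0)."""
    return -cmath.log(1 - z0)


z0 = 0.3 + 0.4j                      # |z0| = 0.5, inside the disk |z| <= 1/2
print(termwise_integral(z0))
print(integral_of_sum(z0))
```

The two values agree to within rounding error, as Theorem 6 predicts for a path lying in a region of uniform convergence.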

The companion result on the term-by-term differentiation 
of series is contained in the following theorem : 


THEOREM 7 

If f(z) is the sum of a uniformly convergent series of analytic functions, then the 
derivative of f(z) at any interior point of the region of uniform convergence can 
be found by term-by-term differentiation of the series. 


PROOF Let z be a general point of the region of uniform convergence R, and
let C be a simple closed curve drawn around z in R. If we write the given series as

$$ f(t) = f_1(t) + f_2(t) + \cdots + f_n(t) + \cdots $$

where t stands for any of the values of z on C, we can multiply by the bounded
function

$$ \frac{1}{2\pi i\,(t - z)^2} $$

and, by Theorem 4, the resulting series

$$ \frac{f(t)}{2\pi i\,(t-z)^2} = \frac{f_1(t)}{2\pi i\,(t-z)^2} + \frac{f_2(t)}{2\pi i\,(t-z)^2} + \cdots + \frac{f_n(t)}{2\pi i\,(t-z)^2} + \cdots $$

will also converge uniformly. By Theorem 6, it can, therefore, be integrated term
by term around C, giving

$$ \frac{1}{2\pi i}\int_C \frac{f(t)\,dt}{(t-z)^2} = \frac{1}{2\pi i}\int_C \frac{f_1(t)\,dt}{(t-z)^2} + \frac{1}{2\pi i}\int_C \frac{f_2(t)\,dt}{(t-z)^2} + \cdots + \frac{1}{2\pi i}\int_C \frac{f_n(t)\,dt}{(t-z)^2} + \cdots $$

But these integrals, by the first generalization of Cauchy’s formula (Theorem 9,
Sec. 14.8), are precisely the derivatives of the respective terms of the given series
at the point z. Hence,

$$ f'(z) = f_1'(z) + f_2'(z) + \cdots + f_n'(z) + \cdots $$

which establishes the theorem.


It is interesting and important to note that Theorem 7 does 
not apply to series of functions of the real variable x. To justify 
term-by-term differentiation of such series, we require not uni- 
form convergence of the original series, but rather uniform con- 
vergence of the series resulting from the term-by-term differen- 
tiation. More precisely, we have the following theorem, which 
is proved in most texts on advanced calculus :* 


THEOREM 8 

If $f(x) = f_1(x) + f_2(x) + f_3(x) + \cdots + f_n(x) + \cdots$

is a convergent series of functions of the real variable x, each of which possesses
a continuous first derivative, then $f'(x)$ can be found by term-by-term
differentiation, provided the series of the derivatives is uniformly convergent.

EXERCISES 

1 Find the region of convergence of the series 

$$ 1 + (z - i) + (z - i)^2 + (z - i)^3 + \cdots $$

2 Find the region of convergence of the series 

$$ \frac{1}{2(z+i)} + \frac{1}{2^2(z+i)^2} + \frac{1}{2^3(z+i)^3} + \frac{1}{2^4(z+i)^4} + \cdots $$

3 Find the region of convergence of the series 

1 + *{r+i) + h(r+ L i) + ?(ym) + ' ' ' 

4 Show that the entire region of convergence of the series of Example 2 consists of the exterior
of the lemniscate $(x^2 - y^2 + 1)^2 + 4x^2y^2 = 1$ together with the origin.

5 Show that the series $x + x(1 - x) + x(1 - x)^2 + x(1 - x)^3 + \cdots$ converges for
$0 \le x < 2$, but that the convergence is nonuniform in any subinterval which contains the
origin.


* See, for instance, A. E. Taylor, “Advanced Calculus,” p. 602, Ginn and 
Company, Boston, 1955. 




6 Show that the series

$$ \frac{1}{1+x^2} - \frac{1}{2+x^2} + \frac{1}{3+x^2} - \frac{1}{4+x^2} + \cdots $$

converges uniformly over any interval of the x-axis, but that this cannot be established by
the Weierstrass M test.

7 Show that the series

$$ \frac{z}{(0 \cdot z + 1)(z + 1)} + \frac{z}{(z + 1)(2z + 1)} + \frac{z}{(2z + 1)(3z + 1)} + \frac{z}{(3z + 1)(4z + 1)} + \cdots $$

converges to 0 if z = 0 and to 1 if $z \ne 0$. Show that the convergence is nonuniform in the
neighborhood of the origin, but uniform in the exterior of any circle with center at the origin.

8 What is the region of convergence of the series ^ “? Where does the series converge 

n=i n 

uniformly? Show that the series V converges uniformly over any interval of the z-axis, 

but that it cannot be differentiated term by term for any value of x. Explain. 

9 Can the sum of a nonuniformly convergent series of continuous functions be continuous?
10 Prove Theorem 1. 


15.2 

Taylor’s expansion 

Very often the series with which one has to deal in applications 
are those which are studied formally in elementary calculus 
under the name of Taylor’s series. Their systematic study begins 
with Taylor’s theorem :* 


THEOREM 1 

If f(z) is analytic throughout the region bounded by a simple closed curve C
and if z and a are both interior to C, then

$$ f(z) = f(a) + f'(a)(z - a) + f''(a)\frac{(z-a)^2}{2!} + \cdots + f^{(n-1)}(a)\frac{(z-a)^{n-1}}{(n-1)!} + R_n $$

where

$$ R_n = \frac{(z-a)^n}{2\pi i}\int_C \frac{f(t)\,dt}{(t-a)^n (t-z)} $$

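PROOF We first note that Cauchy’s integral formula can be written

$$ f(z) = \frac{1}{2\pi i}\int_C \frac{f(t)\,dt}{t - z} = \frac{1}{2\pi i}\int_C \frac{f(t)}{t-a}\left[\frac{1}{1 - (z-a)/(t-a)}\right]dt $$

Then, from this, applying the identity

$$ \frac{1}{1-u} = 1 + u + u^2 + \cdots + u^{n-1} + \frac{u^n}{1-u} $$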


* Named for the English mathematician Brook Taylor (1685-1731).





to the factor

$$ \frac{1}{1 - (z-a)/(t-a)} $$

in the last integral, we have

$$ f(z) = \frac{1}{2\pi i}\int_C \frac{f(t)}{t-a}\left[1 + \frac{z-a}{t-a} + \cdots + \left(\frac{z-a}{t-a}\right)^{n-1} + \frac{(z-a)^n/(t-a)^n}{1 - (z-a)/(t-a)}\right]dt $$

$$ = \frac{1}{2\pi i}\int_C \frac{f(t)\,dt}{t-a} + \frac{z-a}{2\pi i}\int_C \frac{f(t)\,dt}{(t-a)^2} + \cdots + \frac{(z-a)^{n-1}}{2\pi i}\int_C \frac{f(t)\,dt}{(t-a)^n} + \frac{(z-a)^n}{2\pi i}\int_C \frac{f(t)\,dt}{(t-a)^n (t-z)} $$

From the generalizations (Theorem 9, Sec. 14.8) of Cauchy’s integral formula
it is evident that, except for the necessary factorials, the first n integrals in the
last expression are precisely the corresponding derivatives of f(z) evaluated at
the point z = a. Hence,

$$ f(z) = f(a) + f'(a)(z - a) + \cdots + f^{(n-1)}(a)\frac{(z-a)^{n-1}}{(n-1)!} + \frac{(z-a)^n}{2\pi i}\int_C \frac{f(t)\,dt}{(t-a)^n (t-z)} $$

which establishes the theorem.

By Taylor’s series we mean the infinite expansion suggested
by the last theorem, namely,

$$ f(z) = f(a) + f'(a)(z - a) + f''(a)\frac{(z-a)^2}{2!} + \cdots + f^{(n-1)}(a)\frac{(z-a)^{n-1}}{(n-1)!} + \cdots $$

To show that this series actually converges to f(z), we must show,
as usual, that the absolute value of the difference between f(z)
and the sum of the first n terms of the series approaches zero
as n becomes infinite. From Taylor’s theorem it is evident that
this difference is

$$ R_n(z) = \frac{(z-a)^n}{2\pi i}\int_C \frac{f(t)\,dt}{(t-a)^n (t-z)} $$

Accordingly, we must determine the values of z for which the
absolute value of this integral approaches zero as n becomes
infinite.

To do this, let $C_1$ and $C_2$ be two circles of radii $r_1$ and $r_2$
having their centers at the point a and lying entirely in the
interior of C (Fig. 15.2). Since f(z) is analytic throughout the
interior of C, the entire integrand of $R_n(z)$ is analytic in
the region between C and $C_2$, provided that z, like a, lies in the
interior of $C_2$. Under these conditions, the integral around C
can be replaced by the integral around $C_2$. If, in addition, z is
interior to $C_1$, then for all values of t on $C_2$ (the t’s which are
now involved in the integration) we have

$$ |t - a| = r_2 \qquad |z - a| < r_1 \qquad |t - z| > r_2 - r_1 \qquad\text{and}\qquad |f(t)| \le M $$

where M is the maximum of |f(z)| on $C_2$. Hence, overestimating
factors in the numerator and underestimating factors in the
denominator, we have

$$ |R_n(z)| = \left| \frac{(z-a)^n}{2\pi i}\int_{C_2} \frac{f(t)\,dt}{(t-a)^n (t-z)} \right| \le \frac{|z-a|^n}{|2\pi i|}\int_{C_2} \frac{|f(t)|\,|dt|}{|t-a|^n\,|t-z|} $$

$$ < \frac{r_1^{\,n}}{2\pi}\int_{C_2} \frac{M\,|dt|}{r_2^{\,n}(r_2 - r_1)} = \frac{r_1^{\,n} M}{2\pi r_2^{\,n}(r_2 - r_1)}\,2\pi r_2 = \frac{M r_2}{r_2 - r_1}\left(\frac{r_1}{r_2}\right)^n $$

Since $0 < r_1 < r_2$, the fraction $(r_1/r_2)^n$ approaches zero as n
becomes infinite; therefore, the limit of $R_n(z)$ is zero. Thus we
have established the following important theorem:

THEOREM 2 
Taylor’s series,

$$ f(z) = f(a) + f'(a)(z - a) + f''(a)\frac{(z-a)^2}{2!} + f'''(a)\frac{(z-a)^3}{3!} + \cdots $$

is a valid representation of f(z) at all points in the interior of any circle having
its center at a and within which f(z) is analytic.

The largest circle which can be drawn around z = a such that
f(z) is analytic everywhere in its interior is called the circle of
convergence of the Taylor’s series of f(z) about the point z = a.





The radius of this circle is called the radius of convergence of the 
series. Of course, this entire discussion applies without change to 
the case a = 0, which is usually called Maclaurin’s series.* 
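
The integral form of the Taylor coefficients lends itself to a simple numerical check. The sketch below is an illustration under stated assumptions: it approximates $f^{(n)}(a)/n! = \frac{1}{2\pi i}\int_C f(t)\,dt/(t-a)^{n+1}$ by averaging over equally spaced points of a circle about a, and it uses $f(z) = e^z$ as a test function chosen here. The names `taylor_coefficient` and `taylor_partial_sum` are hypothetical.

```python
import cmath
import math


def taylor_coefficient(f, a, n, radius=1.0, samples=64):
    """Approximate f^(n)(a)/n! by the contour integral over |t - a| = radius.

    With t - a = radius*exp(i*theta), dt = i*(t - a)*d(theta), so the integral
    reduces to the average of f(t)/(t - a)**n over the circle.
    """
    total = 0j
    for k in range(samples):
        offset = radius * cmath.exp(2j * math.pi * k / samples)   # t - a
        total += f(a + offset) / offset ** n
    return total / samples


def taylor_partial_sum(f, a, z, terms, radius=1.0):
    """Sum of the first `terms` terms of the Taylor series of f about a."""
    return sum(taylor_coefficient(f, a, n, radius) * (z - a) ** n
               for n in range(terms))


z = 0.3 + 0.2j
approx = taylor_partial_sum(cmath.exp, 0, z, terms=15)
print(abs(approx - cmath.exp(z)))                  # difference from exp(z)
print(taylor_coefficient(cmath.exp, 0, 3).real)    # compare with 1/3!
```

For a function with singularities, the radius of the sampling circle would have to be kept smaller than the distance from a to the nearest singular point.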

The preceding discussion established a circular region around
the point z = a within which the Taylor’s series of f(z) converges
to f(z). However, it did not provide any information about the
behavior of the series outside the circle of convergence. Actually,
the Taylor’s series of f(z) converges only within and possibly
on the circle of convergence, and diverges everywhere outside
this circle, as the following two theorems make clear:

THEOREM 3 
If the power series

$$ a_0 + a_1(z - a) + a_2(z - a)^2 + a_3(z - a)^3 + \cdots $$

converges for $z = z_1$, it converges absolutely for all values of z such that $|z - a| < |z_1 - a|$
and uniformly for all values of z such that $|z - a| \le r < |z_1 - a|$. Moreover,
the sum to which it converges is analytic.


PROOF Since the given series converges when $z = z_1$, it follows that the terms
of the series are bounded for this value of z. That is, there exists a positive
constant M such that

$$ |a_n(z_1 - a)^n| = |a_n|\,|z_1 - a|^n \le M \qquad \text{for } n = 0, 1, 2, \ldots $$

Now let $z_0$ be any value of z such that

$$ |z_0 - a| < |z_1 - a| $$

that is, let $z_0$ be any point nearer to a than $z_1$ is. Then, for the general term of the
series when $z = z_0$, we have

$$ |a_n(z_0 - a)^n| = |a_n|\,|z_0 - a|^n = |a_n|\,|z_1 - a|^n \left|\frac{z_0 - a}{z_1 - a}\right|^n \le M \left|\frac{z_0 - a}{z_1 - a}\right|^n $$

If we set

$$ \left|\frac{z_0 - a}{z_1 - a}\right| = k \tag{1} $$

where k is obviously less than 1, this shows that the absolute values of the terms
of the series

$$ a_0 + a_1(z_0 - a) + a_2(z_0 - a)^2 + a_3(z_0 - a)^3 + \cdots \tag{2} $$

are dominated, respectively, by the terms of the series of positive constants

$$ M + Mk + Mk^2 + Mk^3 + \cdots \tag{3} $$

This is a geometric series whose common ratio k is numerically less than 1. It
therefore converges and, hence, provides a comparison test which establishes
the absolute convergence of the given series (2).

Unfortunately, the series (3) does not provide a test series which can be 


* Named for the Scottish mathematician Colin Maclaurin (1698-1746), 
although another Scottish mathematician, James Stirling (1692-1770),
anticipated by 25 years Maclaurin’s use of this result. 




used in applying the Weierstrass M test to the series (2), because it is clear from
(1) that the terms of the series (3) depend on $z_0$. However, for values of $z_0$ such
that

$$ |z_0 - a| \le r < |z_1 - a| \tag{4} $$

we have

$$ k = \left|\frac{z_0 - a}{z_1 - a}\right| \le \frac{r}{|z_1 - a|} = \lambda $$

and $\lambda$ is clearly a positive constant less than 1 which is independent of $z_0$. Hence,
for all values of $z_0$ satisfying the condition (4), the series (2) is dominated term
by term by the convergent geometric series of positive constants

$$ M + M\lambda + M\lambda^2 + M\lambda^3 + \cdots $$

and, therefore, by Theorem 3, Sec. 15.1, the series (2) is uniformly convergent.

Finally, since each term $a_n(z - a)^n$ is an analytic function and since any
point in the interior of the circle $|z - a| = |z_1 - a|$ can be included within a
circle of the form $|z - a| = r < |z_1 - a|$, it follows from the second part of
Theorem 6, Sec. 15.1, that, within the circle $|z - a| = |z_1 - a|$, the function
to which the series converges is analytic.

Now, let $\alpha$ be the singular point of f(z) nearest to the center
of the expansion z = a, and suppose that the Taylor’s series for
f(z) converges for some value $z = z_1$ farther from a than $\alpha$ is.
By the last theorem, the series must converge at all points nearer
to a than $z_1$ is, and, moreover, the sum must be analytic at every
such point. But this clearly contradicts the hypothesis that $\alpha$ is
a singular point of f(z), and, thus, we have established the following
theorem:


THEOREM 4 

It is impossible for the Taylor’s series of a function f(z) to converge outside the
circle whose center is the point of expansion z = a and whose radius is the
distance from a to the nearest singular point of f(z).

The notion of the circle of convergence is often useful in 
determining the interval of convergence of a series arising as the 
expansion of a function of a real variable. To illustrate, consider 

$$ f(z) = \frac{1}{1+z^2} = 1 - z^2 + z^4 - z^6 + \cdots $$

This will converge throughout the interior of the largest circle 
around the origin in which /(z) is analytic. Now, by inspection, 
f(z) is undefined at z = ±i, and even though one may be concerned
solely with real values of z [for which $1/(1 + x^2)$ is everywhere
infinitely differentiable], these singularities in the complex
plane set an inescapable limit to the interval of convergence on
the x-axis. We can, in fact, have convergence around x = a on
the real axis only over the horizontal diameter of the circle of
convergence in the complex plane.
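
The effect of the singularities at z = ±i on the real series can be seen numerically. The following sketch is illustrative only; `radius_of_convergence` and `partial_sum` are hypothetical helper names, and the list of singular points must be supplied by the user.

```python
def radius_of_convergence(center: complex, singularities) -> float:
    """Distance from the point of expansion to the nearest listed singular point."""
    return min(abs(s - center) for s in singularities)


def partial_sum(x: float, terms: int) -> float:
    """First `terms` terms of 1 - x**2 + x**4 - x**6 + ..."""
    return sum((-1) ** k * x ** (2 * k) for k in range(terms))


poles = (1j, -1j)                                  # singular points of 1/(1 + z**2)
print(radius_of_convergence(0, poles))             # expansion about the origin
print(radius_of_convergence(1, poles))             # expansion about z = 1

for x in (0.9, 1.1):
    print(x, 1 / (1 + x * x), partial_sum(x, 50))
```

For x = 0.9 the partial sums settle down to $1/(1 + x^2)$, while for x = 1.1 they grow in magnitude without bound, although the function itself is perfectly well behaved at both points.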

As an application of Taylor’s expansion, we shall conclude
this section by establishing the simple but important result
known as the theorem of Liouville:*


THEOREM 5 

If f(z) is bounded and analytic for all values of z, then f(z) is a constant. 

PROOF To prove this, we observe first that since f(z) is everywhere analytic,
it possesses a power series expansion around the origin

$$ f(z) = f(0) + f'(0)z + \cdots + \frac{f^{(n)}(0)}{n!}z^n + \cdots $$

which converges and represents it for all values of z. Now, if C is any circle having
the origin as center, it follows from Cauchy’s inequality (Theorem 11, Sec. 14.8)
that

$$ |f^{(n)}(0)| \le \frac{n!\,M_C}{r^n} $$

where $M_C$ is the maximum value of |f(z)| on C and r is the radius of C. Hence, for
the coefficient of $z^n$ in the expansion of f(z), we have

$$ \left| \frac{f^{(n)}(0)}{n!} \right| \le \frac{M_C}{r^n} \le \frac{M}{r^n} $$

where M, the bound on |f(z)| for all values of z, which exists by hypothesis, is
independent of r. Since r can be taken arbitrarily large, it follows, therefore, that
the coefficient of $z^n$ is zero for n = 1, 2, 3, . . . . In other words, for all values of z,

$$ f(z) = f(0) $$

which proves the theorem.

A function which is analytic for all values of z is called an 
entire function or an integral function, and Liouville’s theorem 
thus states that any entire function which is bounded for all values
of z is necessarily a constant. 

EXERCISES 

1 Expand $f(z) = (z - 1)/(z + 1)$ in a Taylor series (a) about the point z = 0 and (b) about
the point z = 1. Determine the region of convergence in each case.

2 Expand f(z) = cosh z in a Taylor series about the point $z = i\pi$. What is the region of
convergence of the resulting series?

3 Expand $f(z) = z/[(z + 1)(z + 2)]$ in a Taylor’s series (a) about the point z = 0 and (b) about
the point z = 2. Determine the region of convergence in each case.

Without obtaining the series, determine the radius of convergence of each of the following
expansions:

4 tan z around z = 0          5 $\mathrm{Tan}^{-1} z$ around z = 1

6 $1/(e^z - 1)$ around z = 4i          7 $x/(x^2 + 2x + 5)$ around x = 1

8 Prove that every polynomial equation P(z) = 0 has at least one root. [Hint: Assume the
contrary and apply Liouville’s theorem to the function f(z) = 1/P(z).]

9 Prove that, if the Taylor expansion of a function around a given point exists, it is unique. 


* Named for the French mathematician Joseph Liouville (1809-1882), but 
actually due to Cauchy. 




10 Prove that, if $\sum_{n=0}^{\infty} a_n z^n$ converges absolutely at one point on its circle of convergence, then
it converges absolutely and uniformly in the closed region bounded by its circle of
convergence.


15.3 

Laurent's expansion 

In many applications it is necessary to expand functions around 
points at which or in the neighborhood of which the functions are 
not analytic. The method of Taylor’s series is obviously inappli- 
cable in such cases, and a new type of series known as Laurent’s 
expansion* is required. This furnishes us with a representation 
which is valid in the annular ring bounded by two concentric 
circles, provided that the function being expanded is analytic 
everywhere on and between the two circles. As in the case of 
Taylor’s series, the function may have singular points outside 
the larger circle, and, as the essentially new feature, it may also 
have singular points within the inner circle. The price we pay 
for this is that negative as well as positive powers of z — a now 
appear in the expansion and that the coefficients, even of the 
positive powers of z — a, cannot be expressed in terms of the 
evaluated derivatives of the function. The precise result is given 
by the following theorem: 


THEOREM 1 

If f(z) is analytic throughout the closed region R bounded by two concentric 
circles, then at any point in the annular ring bounded by the circles, f(z) can be 
represented by the series 

$$ f(z) = \sum_{n=-\infty}^{\infty} a_n (z - a)^n $$

where a is the common center of the circles and

$$ a_n = \frac{1}{2\pi i}\int_C \frac{f(t)\,dt}{(t - a)^{n+1}} $$

each integral being taken in the counterclockwise sense around any curve C
lying in the annulus and encircling its inner boundary.


PROOF Let z be an arbitrary point of the given annulus. Then according to
Cauchy’s integral formula we can write

$$ f(z) = \frac{1}{2\pi i}\int_{C_2 + C_1} \frac{f(t)\,dt}{t - z} = \frac{1}{2\pi i}\int_{C_2} \frac{f(t)\,dt}{t - z} + \frac{1}{2\pi i}\int_{C_1} \frac{f(t)\,dt}{t - z} $$

where $C_2$ is traversed in the counterclockwise direction and $C_1$ is traversed in

* Named for the French mathematician Hermann Laurent (1841-1908).





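the clockwise direction, in order that the entire integration shall be in the positive
direction (Fig. 15.3). Reversing the sign of the integral around $C_1$ and also changing
the direction of integration from clockwise to counterclockwise, we can write

$$ f(z) = \frac{1}{2\pi i}\int_{C_2} \frac{f(t)\,dt}{t - z} - \frac{1}{2\pi i}\int_{C_1} \frac{f(t)\,dt}{t - z} $$

$$ = \frac{1}{2\pi i}\int_{C_2} \frac{f(t)}{t - a}\left[\frac{1}{1 - (z-a)/(t-a)}\right]dt + \frac{1}{2\pi i}\int_{C_1} \frac{f(t)}{z - a}\left[\frac{1}{1 - (t-a)/(z-a)}\right]dt $$

Now, in each of these integrals let us apply the identity

$$ \frac{1}{1-u} = 1 + u + u^2 + \cdots + u^{n-1} + \frac{u^n}{1-u} $$

to the last factor. Then,

$$ f(z) = \frac{1}{2\pi i}\int_{C_2} \frac{f(t)}{t-a}\left[1 + \frac{z-a}{t-a} + \cdots + \left(\frac{z-a}{t-a}\right)^{n-1} + \frac{(z-a)^n/(t-a)^n}{1 - (z-a)/(t-a)}\right]dt $$

$$ \qquad + \frac{1}{2\pi i}\int_{C_1} \frac{f(t)}{z-a}\left[1 + \frac{t-a}{z-a} + \cdots + \left(\frac{t-a}{z-a}\right)^{n-1} + \frac{(t-a)^n/(z-a)^n}{1 - (t-a)/(z-a)}\right]dt $$

$$ = \frac{1}{2\pi i}\int_{C_2} \frac{f(t)\,dt}{t-a} + \frac{z-a}{2\pi i}\int_{C_2} \frac{f(t)\,dt}{(t-a)^2} + \cdots + \frac{(z-a)^{n-1}}{2\pi i}\int_{C_2} \frac{f(t)\,dt}{(t-a)^n} + R_{n2} $$

$$ \qquad + \frac{1}{2\pi i\,(z-a)}\int_{C_1} f(t)\,dt + \frac{1}{2\pi i\,(z-a)^2}\int_{C_1} (t-a) f(t)\,dt + \cdots + \frac{1}{2\pi i\,(z-a)^n}\int_{C_1} (t-a)^{n-1} f(t)\,dt + R_{n1} $$

where

$$ R_{n2} = \frac{(z-a)^n}{2\pi i}\int_{C_2} \frac{f(t)\,dt}{(t-a)^n (t-z)} \qquad\text{and}\qquad R_{n1} = \frac{1}{2\pi i\,(z-a)^n}\int_{C_1} \frac{(t-a)^n f(t)\,dt}{z - t} $$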

694 


INFINITE SERIES IN THE COMPLEX PLANE 


CHAP. 15 


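The truth of the theorem will be established if we can show that

$$ \lim_{n \to \infty} R_{n2} = 0 \qquad\text{and}\qquad \lim_{n \to \infty} R_{n1} = 0 $$

The proof of the first of these equations we can pass over without comment,
because it was given in complete detail in the derivation of Taylor’s series in
Sec. 15.2. To prove the second, we note that, for values of t on $C_1$ (Fig. 15.3),

$$ |t - a| = r_1 $$

$$ |z - a| = \rho \qquad \text{say, where } \rho > r_1 $$

$$ |z - t| = |(z - a) - (t - a)| \ge \rho - r_1 $$

and

$$ |f(t)| \le M $$

where M is the maximum of |f(z)| on $C_1$. Thus,

$$ |R_{n1}| = \left| \frac{1}{2\pi i\,(z-a)^n}\int_{C_1} \frac{(t-a)^n f(t)\,dt}{z - t} \right| \le \frac{1}{|2\pi i|\,|z-a|^n}\int_{C_1} \frac{|t-a|^n\,|f(t)|\,|dt|}{|z - t|} $$

$$ \le \frac{r_1^{\,n} M}{2\pi \rho^n (\rho - r_1)}\int_{C_1} |dt| = \frac{r_1^{\,n} M}{2\pi \rho^n (\rho - r_1)}\,2\pi r_1 = \frac{M r_1}{\rho - r_1}\left(\frac{r_1}{\rho}\right)^n $$

Since $0 < r_1/\rho < 1$, the last expression approaches zero as n becomes infinite.
Hence, $\lim R_{n1} = 0$; and thus we have

$$ f(z) = \frac{1}{2\pi i}\int_{C_2} \frac{f(t)\,dt}{t-a} + \left[\frac{1}{2\pi i}\int_{C_2} \frac{f(t)\,dt}{(t-a)^2}\right](z - a) + \cdots $$

$$ \qquad + \left[\frac{1}{2\pi i}\int_{C_1} f(t)\,dt\right]\frac{1}{z-a} + \left[\frac{1}{2\pi i}\int_{C_1} (t-a) f(t)\,dt\right]\frac{1}{(z-a)^2} + \cdots $$

Since f(z) is analytic throughout the region between $C_1$ and $C_2$, the paths of
integration $C_1$ and $C_2$ can be replaced by any other curve C within this region
and encircling $C_1$. The resulting integrals are precisely the coefficients $a_n$ described
by the theorem; hence, our proof is complete.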
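
The coefficient formula of the theorem can be tried out numerically. The sketch below is an illustration, not part of the text: it takes $f(z) = 1/[z(1 - z)] = 1/z + 1 + z + z^2 + \cdots$ in the annulus $0 < |z| < 1$ as a test function and evaluates $a_n$ by averaging over equally spaced points of the circle $|t - a| = 1/2$. The name `laurent_coefficient` is hypothetical.

```python
import cmath
import math


def laurent_coefficient(f, a, n, radius, samples=256):
    """Approximate a_n = (1/(2*pi*i)) * integral of f(t)/(t - a)**(n + 1) dt
    over the circle |t - a| = radius, traversed counterclockwise.

    With t - a = radius*exp(i*theta), dt = i*(t - a)*d(theta), so a_n is the
    average of f(t)*(t - a)**(-n) over the circle.
    """
    total = 0j
    for k in range(samples):
        offset = radius * cmath.exp(2j * math.pi * k / samples)   # t - a
        total += f(a + offset) * offset ** (-n)
    return total / samples


def f(z):
    return 1 / (z * (1 - z))


for n in range(-3, 4):
    a_n = laurent_coefficient(f, 0, n, radius=0.5)
    print(n, round(a_n.real, 10), round(a_n.imag, 10))
```

The computed values are close to 0 for n < −1 and close to 1 for n ≥ −1, matching the expansion obtained by elementary algebra. Any other circle inside the annulus should give the same coefficients, as the last step of the proof asserts.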


It should be noted that the coefficients of the positive powers
of z − a in Laurent’s expansion, although identical in form with
the integrals of Theorem 9, Sec. 14.8, cannot be replaced by the
derivative expressions

$$ \frac{f^{(n)}(a)}{n!} $$

as they were in the derivation of Taylor’s series, since f(z) is not
analytic throughout the entire interior of $C_2$ (or C), and, hence,
Cauchy’s generalized integral formula cannot be applied.
Specifically, f(z) may have many points of nonanalyticity within $C_1$
and, therefore, within $C_2$ (or C).


SEC. 15.3 


LAURENT’S EXPANSION 


695 


In many instances the Laurent expansion of a function is 
found not through the use of the last theorem, but rather by 
algebraic manipulations suggested by the nature of the function. 
In particular, in dealing with quotients of polynomials, it is often 
advantageous to express them in terms of partial fractions and 
then expand the various denominators in series of the appropriate 
form through the use of the binomial expansion, which we list 
here for reference: 


THEOREM 2
The expansion

$$(s+t)^n = s^n + n s^{n-1}t + \frac{n(n-1)}{2!}\,s^{n-2}t^2 + \frac{n(n-1)(n-2)}{3!}\,s^{n-3}t^3 + \cdots$$

is valid for all values of n if |s| > |t|. If |s| ≤ |t| the expansion is valid only if n
is a nonnegative integer.

That such procedures are correct follows from the fact that the 
Laurent expansion of a function over a given annulus is unique. 
In other words, if an expansion of the Laurent type is found by 
any process, it must be the Laurent expansion. 


EXAMPLE 1 


Find the Laurent expansion of the function f(z) = (7z − 2)/[(z + 1)z(z − 2)] in the annulus
1 < |z + 1| < 3.

As a preliminary step it is convenient to apply the method of partial fractions to f(z) and
express it in the form

$$f(z) = \frac{-3}{z+1} + \frac{1}{z} + \frac{2}{z-2}$$

Now, after suitable rearrangement, these terms can be expanded into infinite series by means of
Theorem 2 and added to give the required expansion for f(z).

To do this, we observe that since the center of the given annulus is z = −1, the series we
are seeking must be one involving powers of z + 1. Hence, we modify the second and third terms
in the partial-fraction representation of f(z) so that z will appear in the combination z + 1.
This gives us the equivalent expression

$$f(z) = \frac{-3}{z+1} + \frac{1}{(z+1)-1} + \frac{2}{(z+1)-3}$$

$$= -3(z+1)^{-1} + [(z+1)-1]^{-1} + 2[(z+1)-3]^{-1}$$

However, according to Theorem 2, the series for $[(z+1)-3]^{-1}$ will converge only where
|z + 1| > 3, whereas we require an expansion valid for |z + 1| < 3. Hence, we rewrite this term
in the other order, $[-3+(z+1)]^{-1}$, before expanding it. Now we can apply Theorem 2,
obtaining

$$f(z) = -3(z+1)^{-1} + [(z+1)-1]^{-1} + 2[-3+(z+1)]^{-1}$$

$$= -3(z+1)^{-1} + [(z+1)^{-1} + (z+1)^{-2} + (z+1)^{-3} + \cdots] + 2\left[-\frac{1}{3} - \frac{z+1}{9} - \frac{(z+1)^2}{27} - \frac{(z+1)^3}{81} - \cdots\right]$$

$$= \cdots + (z+1)^{-2} - 2(z+1)^{-1} - \frac{2}{3} - \frac{2}{9}(z+1) - \frac{2}{27}(z+1)^2 - \cdots \qquad 1 < |z+1| < 3$$




It is important to note that f(z) has two other Laurent expansions around the point z = −1.
One is valid in the annular region between a circle of arbitrarily small radius around z = −1
and a circle of unit radius around z = −1. The other is valid in the region exterior to a circle of
radius 3 around z = −1 (Fig. 15.4). Each of these can be found, as above, by suitably rearranging
the terms in the partial-fraction representation of f(z) and then expanding these terms by
means of Theorem 2. Thus, in the innermost region we have

$$f(z) = -3(z+1)^{-1} + [-1+(z+1)]^{-1} + 2[-3+(z+1)]^{-1}$$

$$= -3(z+1)^{-1} + [-1 - (z+1) - (z+1)^2 - (z+1)^3 - \cdots] + 2\left[-\frac{1}{3} - \frac{z+1}{9} - \frac{(z+1)^2}{27} - \frac{(z+1)^3}{81} - \cdots\right]$$

$$= -3(z+1)^{-1} - \frac{5}{3} - \frac{11}{9}(z+1) - \frac{29}{27}(z+1)^2 - \frac{83}{81}(z+1)^3 - \cdots \qquad 0 < |z+1| < 1$$

Similarly, in the outermost region we have

$$f(z) = -3(z+1)^{-1} + [(z+1)-1]^{-1} + 2[(z+1)-3]^{-1}$$

$$= -3(z+1)^{-1} + [(z+1)^{-1} + (z+1)^{-2} + (z+1)^{-3} + \cdots] + 2[(z+1)^{-1} + 3(z+1)^{-2} + 9(z+1)^{-3} + \cdots]$$

$$= \cdots + 19(z+1)^{-3} + 7(z+1)^{-2} \qquad |z+1| > 3$$

Incidentally, the fact that we have obtained these Laurent expansions without using the
general theory means that we can evaluate the integrals in the coefficient formulas by comparing
them with the numerical values of the coefficients we have found by independent means. For
instance, in the first expansion the coefficient of $(z+1)^{-1}$ is −2. On the other hand, according to
the theory of Laurent's expansion, the coefficient of this term is

$$a_{-1} = \frac{1}{2\pi i}\int_C \frac{f(z)\,dz}{(z+1)^{-1+1}} = \frac{1}{2\pi i}\int_C \frac{7z-2}{(z+1)z(z-2)}\,dz$$

where C is any closed curve lying in the interior of the circle |z + 1| = 3 and enclosing the circle
|z + 1| = 1. Thus, although we have done nothing resembling an integration, we have nonetheless
shown that

$$\frac{1}{2\pi i}\int_C \frac{7z-2}{(z+1)z(z-2)}\,dz = -2 \qquad\text{or}\qquad \int_C \frac{7z-2}{(z+1)z(z-2)}\,dz = -4\pi i$$

a result, incidentally, which could not have been obtained by a direct application of Cauchy’s
integral formula, as in Example 2, Sec. 14.8.
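
The coefficients obtained in this example can be checked numerically. The following is a minimal sketch, not part of the original text: it approximates the coefficient integrals of Laurent's theorem by the trapezoidal rule on a circle centered at z = −1, using only Python's standard library. The helper name `laurent_coefficient`, the radius 2, and the sample count are illustrative assumptions.

```python
import cmath
import math

def f(z):
    # The function of Example 1.
    return (7 * z - 2) / ((z + 1) * z * (z - 2))

def laurent_coefficient(func, center, n, radius, samples=4096):
    """Approximate a_n = (1/(2*pi*i)) * integral of func(z)/(z-center)**(n+1) dz
    over the circle |z - center| = radius (trapezoidal rule in the angle)."""
    total = 0j
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        w = radius * cmath.exp(1j * theta)   # w = z - center on the circle
        # dz = i*w*dtheta, so the 1/(2*pi*i) factor reduces to a plain average
        total += func(center + w) * w ** (-n)
    return total / samples

# Radius 2 lies inside the annulus 1 < |z + 1| < 3.
for n in (-2, -1, 0, 1, 2):
    a_n = laurent_coefficient(f, -1, n, radius=2.0)
    print(n, a_n)
```

Choosing a radius below 1 or above 3 instead selects the expansions valid in the innermost and outermost regions, since the circle of integration then lies in a different annulus.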



EXERCISES

1 Expand f(z) = 1/[(z − 1)(z − 2)]:

a For |z| < 1   b For 1 < |z| < 2   c For 2 < |z|

d For 0 < |z − 1| < 1   e For |z − 1| > 1

f For 0 < |z − 2| < 1   g For |z − 2| > 1

2 Obtain two distinct Laurent expansions for f(z) = (3z + 1)/(z² − 1) around z = 1, and tell
where each converges.

3 Expand f(z) = 1/[z²(z − i)] in two different Laurent expansions around z = i, and tell where
each converges.

4 Construct all the Laurent expansions of f(z) = 1/[z(z − 1)(z − 2)] around z = −1, and tell
where each converges.

5 Find the value of $\int_C f(z)\,dz$ if C is the circle |z| = 3 and f(z) is

a 1 b g +2 1 

2(2 + 2) z(z + 1) C (z + 1)» 


z(z + l) 2 (a + l)(* + 2) 

6 If k is a real number such that k² < 1, prove that

$$\sum_{n=0}^{\infty} k^n \sin (n+1)\theta = \frac{\sin\theta}{1 - 2k\cos\theta + k^2}$$

[Hint: Expand $(z-k)^{-1}$ for |z| > k, set $z = e^{i\theta}$, and equate real and imaginary components
in the resulting expression.]

7 Criticize the following argument: Since (by long division, for instance) 


therefore, by adding these two series we obtain 


8 Criticize the following argument: The series 


- + 1 +*+■••• +z 


- +~ - 

z 1 — z z 


- 1 - 





and this expression clearly approaches 0 as n becomes infinite for all values of z such that 
|z| < 1. 

9 a Show that the Laurent expansion of f(z) = sinh (z + 1/z) in powers of z is
$f(z) = \sum_{n=-\infty}^{\infty} a_n z^n$, where

$$a_n = \frac{1}{2\pi}\int_0^{2\pi} \cos n\theta\,\sinh(2\cos\theta)\,d\theta$$

(Hint: In the formula for $a_n$ provided by Theorem 1, take the curve C to be the circle |z| = 1.
On this circle let the variable of integration t be taken in the form $t = e^{i\theta}$. Finally, verify
that the imaginary part of the integral for $a_n$ is equal to zero.)

b Show that the coefficients in the Laurent expansion of f(z) = sin (z + 1/z) in powers of
z are given by the formula $a_n = \frac{1}{2\pi}\int_0^{2\pi} \cos n\theta\,\sin(2\cos\theta)\,d\theta$.

c What are the coefficients in the Laurent expansion of f(z) = cosh (z + 1/z) in powers of z?

d What are the coefficients in the Laurent expansion of f(z) = cos (z + 1/z) in powers of z?

10 Let f(z) = F(r,θ) be a function which is analytic in some annulus having the origin
as center and containing the circle |z| = 1. Taking this circle as the curve C in the formula
for $a_n$ in the Laurent expansion of f(z), show that


1 f2jr 1 v r 2* 

F(r,*)=-/ o ^)d,+-2/ 0 


F(l,4>) cos n(d — ) d<f> 


CHAPTER SIXTEEN 


The Theory 
of Residues 


16.1 

The residue theorem 


In Sec. 14.6 we defined a singular point of a function f(z) as a
point where f(z) is not analytic but in every neighborhood of
which there are points where f(z) is analytic. If z = a is a singular
point of the function f(z), but if there exists a neighborhood of a
in which there are no other singular points of f(z), then z = a is
called an isolated singular point. Clearly, if z = a is an isolated
singularity of f(z), then f(z) will possess a Laurent expansion
around z = a which will be valid in the interior of an annulus
whose outer radius is the distance from a to the nearest of the
other singular points of f(z) and whose inner radius can be taken
arbitrarily small.

If the Laurent expansion of f(z) in the neighborhood of an
isolated singular point z = a contains only a finite number of
negative powers of z − a, then z = a is called a pole of f(z). If
$(z-a)^{-m}$ is the highest negative power in the expansion, the
pole is said to be of order m, and the sum of all the terms containing
negative powers, namely,

$$\frac{a_{-m}}{(z-a)^m} + \cdots + \frac{a_{-2}}{(z-a)^2} + \frac{a_{-1}}{z-a}$$

is called the principal part of f(z) at z = a. If the Laurent expansion
of f(z) in the neighborhood of an isolated singular point
z = a contains infinitely many negative powers of z − a, then
z = a is called an essential singularity of f(z). For instance, since

$$\frac{1}{z(z-1)^2} = \frac{[1+(z-1)]^{-1}}{(z-1)^2} = \frac{1}{(z-1)^2} - \frac{1}{z-1} + 1 - (z-1) + \cdots \qquad 0 < |z-1| < 1$$


700 


THE THEORY OF RESIDUES 


CHAP. 16 


this function has a pole of order 2 at z = 1, and its principal
part there is

$$\frac{1}{(z-1)^2} - \frac{1}{z-1}$$

On the other hand, since $e^{1/z}$ is represented for all values of z
except z = 0 by the series

$$e^{1/z} = 1 + \frac{1}{z} + \frac{1}{2!\,z^2} + \frac{1}{3!\,z^3} + \frac{1}{4!\,z^4} + \cdots$$

it has an essential singularity at the origin.

In passing, we note that, if the terms in the expansion of
f(z) around a pole of order m, say z = a, are put over a common
denominator, f(z) will contain the factor $1/(z-a)^m$. Conversely,
if a function f(z) is expressed as a fraction in lowest terms, then
the presence of a factor of the form $(z-a)^m$ in the denominator
implies that f(z) has a pole of the mth order at z = a. In most applications
this is the way in which the poles of a function are found.

As we suggested at the end of the last chapter, the coefficient
$a_{-1}$ of the term $(z-a)^{-1}$ in the Laurent expansion of a function
f(z) is of great importance because of its connection with the
integral of the function, through the formula

$$\int_C f(z)\,dz = 2\pi i\,a_{-1}$$

In particular, the coefficient of $(z-a)^{-1}$ in the expansion of f(z)
in the neighborhood of an isolated singular point is called the
residue of f(z) at that point.

Now consider a simple closed curve C containing in its
interior a number of isolated singularities of a function f(z). If
around each singular point we draw a circle so small that it
encloses no other singular points (Fig. 16.1), these circles,
together with the curve C, form the boundary of a multiply connected
region in which f(z) is everywhere analytic and to which
Cauchy’s theorem can, therefore, be applied. This gives

$$\frac{1}{2\pi i}\int_C f(z)\,dz + \frac{1}{2\pi i}\int_{C_1} f(z)\,dz + \cdots + \frac{1}{2\pi i}\int_{C_n} f(z)\,dz = 0$$

If we reverse the direction of integration around each of the
circles and change the sign of each integral to compensate, this

† It should be noted that, although we can also write

$$\frac{1}{z(z-1)^2} = \frac{[(z-1)+1]^{-1}}{(z-1)^2} = \cdots + \frac{1}{(z-1)^5} - \frac{1}{(z-1)^4} + \frac{1}{(z-1)^3} \qquad |z-1| > 1$$

the fact that this expansion contains infinitely many negative powers of
z − 1 does not contradict our observation that 1/z(z − 1)² has a pole of
order 2 at z = 1. For this series is valid only outside the circle |z − 1| = 1,
whereas the presence of poles and essential singularities is determined by the
structure of the particular Laurent expansion which is valid in the innermost
annulus, or deleted neighborhood, of the point in question.


SEC. 16.1


THE RESIDUE THEOREM 


701 



becomes

$$\frac{1}{2\pi i}\int_C f(z)\,dz = \frac{1}{2\pi i}\int_{C_1} f(z)\,dz + \cdots + \frac{1}{2\pi i}\int_{C_n} f(z)\,dz$$

where all the integrals are now to be taken in the counterclockwise
sense. But the integrals on the right are, by definition, just the
residues of f(z) at the various isolated singularities within C.
Hence we have established the important residue theorem:

THEOREM 1

If C is a closed curve and if f(z) is analytic within and on C except at a finite number
of singular points in the interior of C, then

$$\int_C f(z)\,dz = 2\pi i\,(r_1 + r_2 + \cdots + r_n)$$

where $r_1, r_2, \ldots, r_n$ are the residues of f(z) at its singular points within C.


EXAMPLE 1

What is the integral of

$$f(z) = \frac{4-3z}{z(z-1)(z-2)}$$

around the circle |z| = 3/2?

In this case, although there are three singular points of the function, namely, the three
first-order poles at z = 0, z = 1, and z = 2, only z = 0 and z = 1 lie within the path of integration.
Hence, the core of the problem is to find the residues of f(z) at these two points.

To do this, it is natural to begin by constructing the partial-fraction representation of f(z),
namely,

$$f(z) = \frac{2}{z} - \frac{1}{z-1} - \frac{1}{z-2}$$

Then, in the neighborhood of z = 0, we can write

$$f(z) = \frac{2}{z} + (1-z)^{-1} + (2-z)^{-1}$$

$$= \frac{2}{z} + (1 + z + z^2 + \cdots) + \frac{1}{2}\left(1 + \frac{z}{2} + \frac{z^2}{4} + \cdots\right)$$

$$= \frac{2}{z} + \frac{3}{2} + \frac{5z}{4} + \frac{9z^2}{8} + \cdots$$





Hence, the residue of f(z) at z = 0, i.e., the coefficient of the term 1/z in the last expansion, is 2.†
Also, in the neighborhood of z = 1, we have

$$f(z) = 2[1+(z-1)]^{-1} - \frac{1}{z-1} + [1-(z-1)]^{-1}$$

$$= 2[1 - (z-1) + (z-1)^2 - \cdots] - \frac{1}{z-1} + [1 + (z-1) + (z-1)^2 + \cdots]$$

$$= -\frac{1}{z-1} + 3 - (z-1) + 3(z-1)^2 - \cdots$$

Hence, the residue of f(z) at z = 1 is −1. Therefore, according to the residue theorem,

$$\int_C \frac{(4-3z)\,dz}{z(z-1)(z-2)} = 2\pi i\,[(2) + (-1)] = 2\pi i$$
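
The residue theorem can be illustrated numerically for this example. The sketch below is an added illustration under stated assumptions: it takes f(z) in the partial-fraction form 2/z + 1/(1 − z) + 1/(2 − z) used in the expansion about z = 0, and it approximates contour integrals over circles by the trapezoidal rule. The helper names `circle_integral` and `residue`, and the small radius 0.25, are hypothetical choices made here.

```python
import cmath
import math

def f(z):
    # f(z) in the partial-fraction form used in the expansion about z = 0.
    return 2 / z + 1 / (1 - z) + 1 / (2 - z)

def circle_integral(func, center, radius, samples=4096):
    """Counterclockwise integral of func around |z - center| = radius."""
    total = 0j
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        w = radius * cmath.exp(1j * theta)
        total += func(center + w) * 1j * w        # dz = i*w*dtheta
    return total * (2 * math.pi / samples)

def residue(func, a, radius=0.25):
    """Residue at an isolated singularity a, from a small circle that
    encloses no other singularity (an assumption about the radius)."""
    return circle_integral(func, a, radius) / (2j * math.pi)

r0 = residue(f, 0.0)                    # residue at z = 0
r1 = residue(f, 1.0)                    # residue at z = 1
whole = circle_integral(f, 0.0, 1.5)    # integral around |z| = 3/2
print(r0, r1)
print(whole, 2j * math.pi * (r0 + r1))
```

The two printed values on the last line should agree to rounding error, which is exactly the statement of Theorem 1 for the two poles inside the path.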


The determination of residues by the use of series expansions,
in the manner we have just illustrated, is often tedious and sometimes
very difficult. Hence, it is desirable to have a simpler alternative
procedure. Such a process is provided by the following
considerations: Suppose first that f(z) has a simple, or first-order,
pole at z = a. It follows, then, that we can write

$$f(z) = \frac{a_{-1}}{z-a} + a_0 + a_1(z-a) + \cdots$$

If we multiply this identity by z − a, we get

$$(z-a)f(z) = a_{-1} + a_0(z-a) + a_1(z-a)^2 + \cdots$$

Now, if we let z approach a, we obtain for the residue

(1)    $a_{-1} = \lim_{z\to a}\,[(z-a)f(z)]$

If f(z) has a second-order pole at z = a, then

$$f(z) = \frac{a_{-2}}{(z-a)^2} + \frac{a_{-1}}{z-a} + a_0 + a_1(z-a) + a_2(z-a)^2 + \cdots$$

Now, to obtain the residue $a_{-1}$, we must multiply this identity
by $(z-a)^2$ and then differentiate with respect to z before we
let z approach a. The result this time is

(2)    $a_{-1} = \lim_{z\to a}\frac{d}{dz}\,[(z-a)^2 f(z)]$

The same procedure can be extended to poles of higher order,
leading to the formula contained in the following theorem:


THEOREM 2

If f(z) has a pole of order m at z = a, then the residue of f(z) at z = a is

$$a_{-1} = \frac{1}{(m-1)!}\lim_{z\to a}\frac{d^{m-1}}{dz^{m-1}}\,[(z-a)^m f(z)]$$


† Since 1/(z − 1) and 1/(z − 2) are both analytic in the neighborhood of
z = 0, it is evident in advance that their expansions around z = 0 will be,
not Laurent, but Taylor series and, hence, will contain no negative powers
of z. Thus neither of these terms can contribute to the residue of f(z) at
z = 0, and so it is actually unnecessary to obtain their expansions. The same
thing is true of the terms 2/z and 1/(z − 2) around z = 1.




In many problems the order of the pole at z = a will not
be known in advance. In such cases it is still possible to apply
Theorem 2 by taking m = 1, 2, 3, . . . , in turn, until for the
first time a finite limit is obtained for $a_{-1}$. The value of m for
which this occurs is the order of the pole, and the value of $a_{-1}$
thus determined is the residue. If f(z) has an essential singularity
at z = a, however, this process fails, and the residue cannot be
determined by means of Theorem 2.


EXAMPLE 2

What is the residue of f(z) = (1 + z)/(1 − cos z) at the origin?

Here the order of the pole is unknown; so it appears that we may have to proceed tentatively,
trying m = 1, 2, 3, . . . , in turn, until for the first time we obtain a finite value for the residue
$a_{-1}$. However, if we replace cos z by its Maclaurin expansion, we obtain for f(z) the expression

$$f(z) = \frac{1+z}{z^2\left(\frac{1}{2!} - \frac{z^2}{4!} + \cdots\right)} = \frac{2(1+z)}{z^2\left(1 - \frac{z^2}{12} + \cdots\right)}$$

and the factor z² in the denominator now identifies the pole as of the second order. Hence, applying
Theorem 2 with m = 2, we have

$$a_{-1} = \lim_{z\to 0}\frac{d}{dz}\left[\frac{2(1+z)}{1 - \frac{z^2}{12} + \cdots}\right] = \lim_{z\to 0}\,2\,\frac{\left(1 - \frac{z^2}{12} + \cdots\right) - (1+z)\left(-\frac{z}{6} + \cdots\right)}{\left(1 - \frac{z^2}{12} + \cdots\right)^2} = 2$$
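
As a rough numerical check of this example (an added sketch, not the text's method), the residue can be approximated by integrating f(z) = (1 + z)/(1 − cos z) around the circle |z| = 1, which encloses only the singularity at the origin because the other zeros of 1 − cos z are at nonzero multiples of 2π. The function name `residue` and the sample count are assumptions made for illustration. The last loop also mimics the tentative procedure described above by evaluating z^m f(z) near the origin for m = 1 and m = 2.

```python
import cmath
import math

def f(z):
    return (1 + z) / (1 - cmath.cos(z))

def residue(func, a, radius=1.0, samples=2048):
    """(1/(2*pi*i)) * integral of func around |z - a| = radius,
    approximated by the trapezoidal rule in the angle."""
    total = 0j
    for k in range(samples):
        theta = 2 * math.pi * (k + 0.5) / samples
        w = radius * cmath.exp(1j * theta)
        total += func(a + w) * w       # dz/(2*pi*i) = w*dtheta/(2*pi)
    return total / samples

print(residue(f, 0.0))

# z**m * f(z) close to the origin: large for m = 1, moderate for m = 2,
# which is consistent with a pole of the second order.
for m in (1, 2):
    z = 1e-3
    print(m, abs(z ** m * f(z)))
```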


EXERCISES 

1 Find the residue of f(z) = z/(z² + 1) (a) at z = i and (b) at z = −i.

2 Find the residue of f(z) = (z + 1)/[z²(z − 2)] (a) at z = 0 and (b) at z = 2.

3 Find the residue of f(z) = z/(z² + 2z + 5) at each of its poles.

4 What is the residue of f(z) = 1/(z + 1)³ at z = −1?

5 What is the residue of f(z) = tan z at z = π/2?

6 What is the residue of f(z) = z/(cosh z − cos z) at z = 0?

7 What is the residue of f(z) = 1/(z − sin z) at z = 0?

8 What is the residue of f(z) = 1/(e^z − 1) at z = 0?

9 If C is the circle |z| = 4, evaluate $\int_C f(z)\,dz$ for the functions:


a /(z) 


f(z) = 
/(z) = 


1 

z(z - 2) 3 
1 

z 2 + z + 1 


b /(z) 


z+1 

z 2 (z + 2) 


d ^ (z) (z 2 + 3z + 2) a 

f = z(z 2 + 6z + 4) 





10 If C is the circle |z| = 2, evaluate $\int_C f(z)\,dz$ for the functions:

a f(z) - tan 3 b f(z) = 

z sm z 

c f(z) = -t — — d j{z) = ~ 

Z i Sin 2 2 a 

e f(z) = ze lh i f(z) “ — 

COS 2 

16.2 

The evaluation of real definite integrals 

There are several large and important classes of real definite 
integrals whose evaluation by the theory of residues can be made 
a routine matter. The results in question are contained in the 
next three theorems. 


THEOREM 1 

If R(cos θ, sin θ) is a rational function of cos θ and sin θ which is finite on the
closed interval 0 ≤ θ ≤ 2π and if f(z) is the function obtained from R by the
substitutions

$$\cos\theta = \frac{z+z^{-1}}{2} \qquad \sin\theta = \frac{z-z^{-1}}{2i}$$

then $\int_0^{2\pi} R(\cos\theta,\sin\theta)\,d\theta$ is equal to 2πi times the sum of the residues of the
function f(z)/(iz) at such of its poles as lie within the unit circle |z| = 1.


PROOF As a first step, let us transform the given integral by means of the
substitution $z = e^{i\theta}$, according to which

$$\cos\theta = \frac{e^{i\theta}+e^{-i\theta}}{2} = \frac{z+z^{-1}}{2} \qquad \sin\theta = \frac{e^{i\theta}-e^{-i\theta}}{2i} = \frac{z-z^{-1}}{2i} \qquad d\theta = \frac{dz}{iz}$$

Under this transformation the original integrand becomes a rational function
of z, which we call f(z). Furthermore, as θ ranges from 0 to 2π, the relation $z = e^{i\theta}$
shows that z ranges around the unit circle |z| = 1. Hence, the transformed integral is

$$\int_C f(z)\,\frac{dz}{iz}$$

where C is the unit circle. By the residue theorem, the value of this integral is
2πi times the sum of the residues at those poles of its integrand, namely, f(z)/iz,
which lie within the unit circle. Since this integral is equal to the original one,
the theorem is established.



SEC. 16.2 


THE EVALUATION OF REAL DEFINITE INTEGRALS 


705 


EXAMPLE 1

Evaluate $\int_0^{2\pi}\frac{\cos 2\theta\,d\theta}{1 - 2p\cos\theta + p^2}$ (−1 < p < 1).

Since the denominator of the integrand can be written

$$1 - 2p\cos\theta + p^2 = 1 - 2p + p^2 + 2p - 2p\cos\theta = (1-p)^2 + 2p(1-\cos\theta)$$

$$= 1 + 2p + p^2 - 2p - 2p\cos\theta = (1+p)^2 - 2p(1+\cos\theta)$$

it is clear that it can never vanish for 0 ≤ θ ≤ 2π if −1 < p < 1. Hence, the preceding theorem
is applicable. Now

$$\cos 2\theta = \frac{e^{2i\theta}+e^{-2i\theta}}{2} = \frac{z^2+z^{-2}}{2}$$

and thus the given integral becomes

$$\int_C \frac{(z^2+z^{-2})/2}{1 - 2p(z+z^{-1})/2 + p^2}\,\frac{dz}{iz} = \int_C \frac{z^4+1}{2z^2}\,\frac{z}{z - pz^2 - p + p^2 z}\,\frac{dz}{iz} = \int_C \frac{(1+z^4)\,dz}{2iz^2(1-pz)(z-p)}$$

Of the three poles of the integrand, only the first-order pole at z = p and the second-order
pole at z = 0 lie within the unit circle. For the residue at the former we have

$$\lim_{z\to p}\,(z-p)\,\frac{1+z^4}{2iz^2(1-pz)(z-p)} = \frac{1+p^4}{2ip^2(1-p^2)}$$

For the residue at the second-order pole z = 0, we have

$$\lim_{z\to 0}\frac{d}{dz}\left[z^2\,\frac{1+z^4}{2iz^2(z - pz^2 - p + p^2 z)}\right] = \lim_{z\to 0}\frac{(z - pz^2 - p + p^2 z)(4z^3) - (1+z^4)(1 - 2pz + p^2)}{2i(z - pz^2 - p + p^2 z)^2} = -\frac{1+p^2}{2ip^2}$$

By Theorem 1, the value of the integral is therefore

$$2\pi i\left[\frac{1+p^4}{2ip^2(1-p^2)} - \frac{1+p^2}{2ip^2}\right] = \frac{2\pi p^2}{1-p^2}$$
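
The closed form 2πp²/(1 − p²) can be compared with a direct numerical quadrature of the original real integral. The following is a minimal sketch added for illustration, assuming only Python's standard library; the function names and the sample values of p are hypothetical. Because the integrand is smooth and periodic, the plain trapezoidal rule over one period is already very accurate.

```python
import math

def integrand(theta, p):
    return math.cos(2 * theta) / (1 - 2 * p * math.cos(theta) + p * p)

def integrate_periodic(func, p, samples=2000):
    """Trapezoidal rule over one full period [0, 2*pi]."""
    h = 2 * math.pi / samples
    return h * sum(func(k * h, p) for k in range(samples))

def closed_form(p):
    # Value obtained above by the method of residues.
    return 2 * math.pi * p * p / (1 - p * p)

for p in (-0.5, 0.3, 0.8):
    print(p, integrate_periodic(integrand, p), closed_form(p))
```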


THEOREM 2 

If Q(z) is a function which is analytic in the upper half of the z-plane except at a
finite number of poles none of which lies on the real axis and if zQ(z) converges
uniformly to zero when z → ∞ through values for which 0 ≤ arg z ≤ π, then
$\int_{-\infty}^{\infty} Q(x)\,dx$ is equal to 2πi times the sum of the residues at the poles of Q(z) which
lie in the upper half plane.

PROOF We consider a semicircular contour with center at z = 0 and with
radius R large enough to include all the poles of Q(z) which lie in the upper half
plane (Fig. 16.2). Then, by the residue theorem,

$$\int_{C_1+C_2} Q(z)\,dz = 2\pi i\sum \text{residues of } Q(z) \text{ at all poles within } C_1+C_2$$

or

$$\int_{-R}^{R} Q(x)\,dx + \int_{C_2} Q(z)\,dz = 2\pi i\sum \text{residues}$$

Hence,

(1)    $\left|\int_{-R}^{R} Q(x)\,dx - 2\pi i\sum \text{residues}\right| = \left|-\int_{C_2} Q(z)\,dz\right|$

In the integral on the right, let $z = Re^{i\theta}$, so that $dz = Rie^{i\theta}\,d\theta = iz\,d\theta$. Then

$$\left|-\int_{C_2} Q(z)\,dz\right| = \left|-\int_0^{\pi} Q(z)\,iz\,d\theta\right| \le \int_0^{\pi}|zQ(z)|\,|d\theta|$$

But from the hypothesis that |zQ(z)| converges uniformly to zero when z → ∞ and
0 ≤ arg z ≤ π, it follows that, for any arbitrarily small positive quantity, say ε/π,
there exists a radius $R_0$ such that

$$|zQ(z)| < \frac{\varepsilon}{\pi}$$

for all values of z on $C_2$ whenever $R > R_0$. Thus, for $R > R_0$,

$$\int_0^{\pi}|zQ(z)|\,|d\theta| < \frac{\varepsilon}{\pi}\int_0^{\pi}|d\theta| = \varepsilon$$

This, coupled with (1), proves that

$$\lim_{R\to\infty}\int_{-R}^{R} Q(x)\,dx = 2\pi i\sum \text{residues}$$

Since the limit on the left is what we mean by $\int_{-\infty}^{\infty} Q(x)\,dx$,† the theorem is
established.

In particular, the quotient of two polynomials p(x)/q(x) 
automatically satisfies all the hypotheses of the last theorem 
whenever the degree of the denominator exceeds the degree of the 
numerator by at least 2. Hence, we have the following highly 
important corollary: 


† Actually $\lim_{R\to\infty}\int_{-R}^{R} Q(x)\,dx$ is only the principal value of the integral
$\int_{-\infty}^{\infty} Q(x)\,dx$, whose correct definition is

$$\lim_{R\to\infty}\int_{-R}^{0} Q(x)\,dx + \lim_{S\to\infty}\int_{0}^{S} Q(x)\,dx$$

where R and S become infinite independently of each other. As the simple
function Q(x) = x shows, the principal value of an integral may exist when
the integral itself is undefined. However, under the relatively stringent
conditions of Theorem 2 the existence of the principal value implies the
existence of the integral itself.




COROLLARY 1

If p(x) and q(x) are real polynomials such that the degree of q(x) is at least 2 more
than the degree of p(x) and if q(x) = 0 has no real roots, then

$$\int_{-\infty}^{\infty}\frac{p(x)}{q(x)}\,dx = 2\pi i\sum \text{residues of } \frac{p(z)}{q(z)} \text{ at its poles in the upper half plane}$$

EXAMPLE 2

Evaluate $\int_{-\infty}^{\infty}\frac{x^2\,dx}{(x^2+a^2)(x^2+b^2)}$.

This is an integral to which the corollary of Theorem 2 can surely be applied. The only poles
of

$$\frac{z^2}{(z^2+a^2)(z^2+b^2)}$$

are at z = ±ai, ±bi. Of these, only z = ai and z = bi lie in the upper half plane. At z = ai the
residue is

$$\lim_{z\to ai}\,(z-ai)\,\frac{z^2}{(z-ai)(z+ai)(z^2+b^2)} = \frac{-a^2}{2ai(b^2-a^2)} = \frac{a}{2i(a^2-b^2)}$$

From symmetry, the residue at z = bi is obviously b/2i(b² − a²). Hence, the value of the
integral is

$$2\pi i\left[\frac{a}{2i(a^2-b^2)} + \frac{b}{2i(b^2-a^2)}\right] = \frac{\pi}{a+b}$$
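
The value π/(a + b) can be checked in two independent ways. The sketch below is an added illustration with assumed names: `residue_sum_value` applies Corollary 1 using the standard simple-pole shortcut p(z)/q′(z), which is a convenience not taken from the text and which agrees with the limit formula used above, and `direct_value` integrates the real integrand after the substitution x = tan u, which maps the whole real line onto a finite interval. It assumes a and b are positive and distinct.

```python
import math

def residue_sum_value(a, b):
    """2*pi*i times the sum of the residues of z^2/((z^2+a^2)(z^2+b^2))
    at z = ai and z = bi, each taken as p(z)/q'(z) at a simple pole."""
    def p(z):
        return z * z
    def dq(z):
        # derivative of q(z) = (z^2 + a^2)(z^2 + b^2)
        return 2 * z * (z * z + b * b) + 2 * z * (z * z + a * a)
    poles = [1j * a, 1j * b]
    return 2j * math.pi * sum(p(z) / dq(z) for z in poles)

def direct_value(a, b, samples=4000):
    """Midpoint rule after x = tan(u), -pi/2 < u < pi/2, where the integrand
    becomes sin^2 u / ((sin^2 u + a^2 cos^2 u)(sin^2 u + b^2 cos^2 u))."""
    h = math.pi / samples
    total = 0.0
    for k in range(samples):
        u = -math.pi / 2 + (k + 0.5) * h
        s2, c2 = math.sin(u) ** 2, math.cos(u) ** 2
        total += s2 / ((s2 + a * a * c2) * (s2 + b * b * c2))
    return total * h

a, b = 1.0, 2.0
print(residue_sum_value(a, b), direct_value(a, b), math.pi / (a + b))
```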

If Q(z) satisfies all the hypotheses of Theorem 2, then so does
$e^{imz}Q(z)$, provided m > 0. For $e^{imz}$ is analytic everywhere, and,
under the assumption that m > 0, its absolute value is

$$|e^{imz}| = |e^{im(x+iy)}| = |e^{imx}e^{-my}| = e^{-my}$$

which is less than or equal to 1 for all values of y in the upper half
plane. Therefore,

$$|e^{imz}zQ(z)| \le |zQ(z)|$$

and thus, if the latter converges uniformly to zero when z → ∞
and 0 ≤ arg z ≤ π, so will the former. Hence, the conclusions of
Theorem 2 can be applied equally well to $e^{imz}Q(z)$, and we can
write

(2)    $\int_{-\infty}^{\infty} e^{imx}Q(x)\,dx = 2\pi i\sum \text{residues of } e^{imz}Q(z) \text{ at its poles in the upper half plane}$


Separating the integral in (2) into its real and its imaginary parts
and equating these to the corresponding parts of the right-hand
side, we obtain the following useful result:


COROLLARY 2

If Q(z) is analytic in the upper half of the z-plane except at a finite number of
poles none of which lies on the real axis and if |zQ(z)| converges uniformly to zero
when z becomes infinite through the upper half plane, then

$$\int_{-\infty}^{\infty}\cos mx\,Q(x)\,dx = -2\pi\sum \text{imaginary parts of the residues of } e^{imz}Q(z) \text{ at its poles in the upper half plane}$$

$$\int_{-\infty}^{\infty}\sin mx\,Q(x)\,dx = 2\pi\sum \text{real parts of the residues of } e^{imz}Q(z) \text{ at its poles in the upper half plane}$$


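EXAMPLE 3

Evaluate $\int_{-\infty}^{\infty}\frac{\cos mx}{1+x^2}\,dx$.

To do this, we consider the related function $e^{imz}/(1+z^2)$. The only pole of this function in
the upper half plane is z = i, and the residue there is

$$\lim_{z\to i}\,(z-i)\,\frac{e^{imz}}{(z-i)(z+i)} = \frac{e^{-m}}{2i} = -\frac{i\,e^{-m}}{2}$$

Hence, by Corollary 2,

$$\int_{-\infty}^{\infty}\frac{\cos mx}{1+x^2}\,dx = -2\pi\left(-\frac{e^{-m}}{2}\right) = \pi e^{-m}$$

Incidentally, the fact that the residue at z = i is a pure imaginary quantity confirms the observation,
obvious from symmetry, that

$$\int_{-\infty}^{\infty}\frac{\sin mx}{1+x^2}\,dx = 0$$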
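
Corollary 2 can be exercised on this example in a few lines. The sketch below is an added illustration assuming m > 0: it forms the residue of e^{imz}/(1 + z²) at z = i from the limit formula, applies the two formulas of Corollary 2, and compares the cosine integral with a truncated trapezoidal quadrature. The truncation limit 200 and step 0.01 are arbitrary assumptions, so the quadrature agrees only to a few decimal places.

```python
import cmath
import math

def residue_at_i(m):
    """Residue of exp(i*m*z)/(1 + z**2) at the simple pole z = i,
    from lim (z - i) f(z) = exp(i*m*i)/(2i)."""
    return cmath.exp(1j * m * 1j) / 2j

def cosine_integral(m):
    # Corollary 2: -2*pi times the imaginary part of the residue.
    return -2 * math.pi * residue_at_i(m).imag

def sine_integral(m):
    # Corollary 2: 2*pi times the real part of the residue.
    return 2 * math.pi * residue_at_i(m).real

def truncated_quadrature(m, limit=200.0, step=0.01):
    """Composite trapezoidal rule for cos(m*x)/(1 + x^2) on [-limit, limit]."""
    n = int(round(2 * limit / step))
    total = 0.0
    for k in range(n + 1):
        x = -limit + k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.cos(m * x) / (1 + x * x)
    return total * step

m = 1.0
print(cosine_integral(m), math.pi * math.exp(-m), truncated_quadrature(m))
print(sine_integral(m))
```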


As a final result on the evaluation of real definite integrals by 
the method of residues, we have the following theorem, whose 
proof we omit because of its relative intricacy.* 


THEOREM 3 

If Q(z) is analytic everywhere in the z-plane except at a finite number of poles
none of which lies on the positive half of the real axis and if $|z^a Q(z)|$ converges
uniformly to zero when z → 0 and when z → ∞, then

$$\int_0^{\infty} x^{a-1}Q(x)\,dx = \frac{\pi}{\sin a\pi}\sum \text{residues of } (-z)^{a-1}Q(z) \text{ at all its poles}$$

provided that arg z is taken in the interval (−π, π).

In applying this theorem it must be borne in mind that unless a
is an integer, $(-z)^{a-1}$ is a multiple-valued function which, according
to Eq. (23), Sec. 14.7, is to be interpreted as

$$(-z)^{a-1} = e^{(a-1)\ln(-z)} = e^{(a-1)[\ln|z| + i\arg(-z)]} \qquad -\pi < \arg(-z) < \pi$$


EXAMPLE 4

Evaluate $\int_0^{\infty}\frac{x^{a-1}}{1+x^2}\,dx$, 0 < a < 2.

For a within the specified range, the conditions of Theorem 3 are fulfilled; hence the given
integral is equal to π/(sin aπ) times the sum of the residues of $(-z)^{a-1}/(1+z^2)$ at z = ±i. At

* See, for instance, E. T. Whittaker and G. N. Watson, “Modern Analysis,” 
p. 117, The Macmillan Company, New York, 1943. 




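z = i we have for the residue

$$\lim_{z\to i}\,(z-i)\,\frac{(-z)^{a-1}}{(z-i)(z+i)} = \frac{(-i)^{a-1}}{2i} = \frac{(e^{-i\pi/2})^{a-1}}{2i} = \frac{e^{-i\pi(a-1)/2}}{2i}$$

At z = −i, for the residue we have

$$\lim_{z\to -i}\,(z+i)\,\frac{(-z)^{a-1}}{(z+i)(z-i)} = \frac{i^{a-1}}{-2i} = \frac{(e^{i\pi/2})^{a-1}}{-2i} = \frac{e^{i\pi(a-1)/2}}{-2i}$$

The value of the integral is, therefore,

$$\frac{\pi}{\sin a\pi}\left[\frac{e^{-i\pi(a-1)/2} - e^{i\pi(a-1)/2}}{2i}\right] = -\frac{\pi}{\sin a\pi}\,\sin\frac{(a-1)\pi}{2} = \frac{\pi}{\sin a\pi}\,\cos\frac{a\pi}{2} = \frac{\pi}{2\sin(a\pi/2)}$$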
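
The result of this example can be verified numerically for a particular value of a. The sketch below is an added illustration, not the text's procedure: `by_residues` evaluates the residue sum of Theorem 3 with Python's complex power, which uses the principal branch and therefore matches the interpretation of (−z)^{a−1} given above, and `by_quadrature` integrates the real integrand after the substitution x = e^u. The function names, the truncation limits, and the step size are assumptions. The value a = 1 is avoided because sin aπ vanishes there, even though the final closed form remains finite.

```python
import cmath
import math

def by_residues(a):
    """pi/sin(a*pi) times the sum of the residues of (-z)**(a-1)/(1+z**2)
    at z = i and z = -i (principal branch of the power)."""
    res_plus = (-1j) ** (a - 1) / 2j        # residue at z = i
    res_minus = (1j) ** (a - 1) / (-2j)     # residue at z = -i
    return math.pi / math.sin(a * math.pi) * (res_plus + res_minus)

def by_quadrature(a, lo=-80.0, hi=80.0, step=0.01):
    """Trapezoidal rule after x = exp(u): the integrand becomes
    exp(a*u)/(1 + exp(2*u)), which decays at both ends for 0 < a < 2."""
    n = int(round((hi - lo) / step))
    total = 0.0
    for k in range(n + 1):
        u = lo + k * step
        total += math.exp(a * u) / (1 + math.exp(2 * u))
    return total * step

def closed_form(a):
    return math.pi / (2 * math.sin(a * math.pi / 2))

for a in (0.5, 1.5):
    print(a, by_residues(a), by_quadrature(a), closed_form(a))
```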

For definite integrals not covered by the theorems of this 
section, evaluation by the method of residues, when possible at 
all, usually requires considerable ingenuity in selecting the ap- 
propriate contour and in eliminating the integrals over all but the 
desired portion of the contour. Several examples of this sort will be 
found, with hints, in the exercises. 


EXERCISES 

Evaluate the following integrals by the method of residues: 
dO 



JO (x + b)(x + c)(x + d ) 

23 /“ — - dx 0 < a < 

Jo 1 + x 3 


f” - X ° .— rfx 

h 1 + x* 


THE THEORY OF RESIDUES 


CHAP. 16 


26 Show that r(a)r(l — a) = *-/(sin aw) for 0 < a < 1. (Hint: Consider the integral 

[ — dy, and evaluate it first by the method of residues and then by making the 

Jo l+y 

substitution y = x/(l — a;) and expressing it in terms of gamma functions.] 

26 Show that l dx - -• (Hint: Integrate e'*/z around the contour shown in Fig. 16.3, 

JO x 2 

and let r -* 0 and R -» *> .) 



28 If f(z) has a number of first-order poles on the real axis, but otherwise satisfies all the condi- 
tions of Theorem 2, show that the principal value of J e imz f(x) dx is equal to 2ir i times 
the sum of the residues of e imz f{z) at its poles in the upper half plane plus iw times the sum of 
the residues of e iml f(z) at its poles on the real axis. [Hint: Use a contour like that shown in 
Fig. 16.3, suitably indented around each of the poles of f{z) which lies on the real axis.] 

29 What is the Fourier expansion of the periodic function — (0 < 6 < a)? Discuss 

a + b cos 8 

from the point of view of Theorem 3, Sec. 6,3, the limiting behavior of the Fourier coeffi- 
cients of this function as n » . 

30 Show that f -- -- -- dx = — - — — [Hint: Integrate the function e im */(e* + e~ z ) 

J - ® e x + e~ x e m * n + e~ mVn 

around the contour shown in Fig. 16.5 and let 12 — * «> .] 



C 3 

7 r 




'if';;: -v| 

|c 2 


Cl I 

l X 



FIGURE 16.5 


SEC. 16.3 


THE COMPLEX INVERSION INTEGRAL 


16.3 

The complex inversion integral 


( 1 ) 

(2) 


(3) 


We are now in a position to appreciate more fully the sig nifi cance 
of the complex inversion integral of Laplace transform theory. 
In Sec. 6.8 we defined the Laplace transform of a function /(i) 
to be 

£{/(*)} - Me-* dt 
and we showed that conversely 

m ■■ 


h JCT 


2m 

s being a complex variable. It is interesting now to reconsider 
the derivation of (2) in the light of complex variable theory and 
to investigate how this formula can be applied to the determina- 
tion of a function when its transform is known. 

In the complex plane let <£(z) be a function of z, analytic on 
the line x — a and in the entire half plane R to the right of this 
line. Moreover, let |<£(z)| approach zero uniformly as z becomes 
infinite through this half plane. Then, if s is any point in the half 
plane R, we can choose a semicircular contour C = Ci + Ci, 
as shown in Fig. 16.6, and apply Cauchy’s integral formula, 
getting 

*(,)_ ± [ -*« * 1 r~* *VL «fe+ 1 f M . * 

2 m Jc z — s 2m Ja+tb z — s 2m Jc» z — s 

Now, for values of z on the semicircle C 2 and for b sufficiently 
large, we have 

\z — s| 2: b — |s — a\ ^ 6 — |s| — |a[ 

whether a is positive, as shown in Fig. 16.6, or negative. Hence, 
letting M denote the maximum value of |<#>(z)| on Ci, we have 

♦to ... I ^ r ktol .... _ M r , rbM 




1*1 - 1*1 /ft " 


b - \s\ - \a\ 


FIGURE 16.6 
The contour used 
to obtain the 
complex inver- 
sion integral. 




s \ 


b 

/ v 


„ 

A »'-a! \ 

(.1 


I * 



Cl / 


' f- |f 
? \ 


I : 



710 


THE THEORY OP RESIDUES 


CHAP. 16 


25 Show that r(a)r(l — a) = 7r/(sin air) for 0 < a < 1. [Hint: Consider the integral 

f — dn, and evaluate it first by the method of residues and then by making the 

Jo 1 + y 

substitution y — x/{l — x) and expressing it in terms of gamma functions.] 

26 Show that [ dx = ~ (Hint: Integrate e iz [z around the contour shown in Fig. 16.3, 

Jo x 2 



27 Show that ( a ^dx - 

Jo Vx Jo 

tour shown in Fig. 16.4, let r 

/o ■'-*-£] 


- [Hint: Integrate & z jy/z around the con- 
oo, and recall (Exercise 10, Sec. 7.3) that 



28 If /(z) has a number of first-order poles on the real axis, but otherwise satisfies all the condi- 
tions of Theorem 2, show that the principal value of J e imx f(x) dx is equal to 2?ri times 
the sum of the residues of e* mi /(z) at its poles in the upper half plane plus i-r times the sum of 
the residues of e' mz j{z) at its poles on the real axis. [Hint: Use a contour like that shown in 
Fig. 16.3, suitably indented around each of the poles of f(z) which lies on the real axis.] 


from the point of view of Theorem 3, Sec. 6.3, the limiting behavior of the Fourier coeffi- 
cients of this function as n -> =o . 

Show that f — dx = — [Hint: Integrate the function e im */(e‘ -f- e~ z ) 

J — « e x 4- er x e m * 12 4- 


around the contour shown in Fig. 16.5 and let R - 




SEC. 16.3 


THE COMPLEX INVERSION INTEGRAL 


711 


The complex inversion integral 


We are now in a position to appreciate more fully the significance 
of the complex inversion integral of Laplace transform theory. 
In Sec. 6.8 we defined the Laplace transform of a function /(i) 
to be 

£{/(<)} = f”f{t)<r«dt 

and we showed that conversely 

m = as a: *i aov* 

s being a complex variable. It is interesting now to reconsider 
the derivation of (2) in the light of complex variable theory and 
to investigate how this formula can be applied to the determina- 
tion of a function when its transform is known. 

In the complex plane let $\phi(z)$ be a function of $z$, analytic on
the line $x = a$ and in the entire half plane $R$ to the right of this
line. Moreover, let $|\phi(z)|$ approach zero uniformly as $z$ becomes
infinite through this half plane. Then, if $s$ is any point in the half
plane $R$, we can choose a semicircular contour $C = C_1 + C_2$,
as shown in Fig. 16.6, and apply Cauchy's integral formula,
getting

(3)
$$\phi(s) = \frac{1}{2\pi i}\oint_C \frac{\phi(z)}{z-s}\,dz = \frac{1}{2\pi i}\int_{C_1}\frac{\phi(z)}{z-s}\,dz + \frac{1}{2\pi i}\int_{C_2}\frac{\phi(z)}{z-s}\,dz$$

Now, for values of $z$ on the semicircle $C_2$ and for $b$ sufficiently
large, we have

$$|z - s| \ge b - |s - a| \ge b - |s| - |a|$$

whether $a$ is positive, as shown in Fig. 16.6, or negative. Hence,
letting $M$ denote the maximum value of $|\phi(z)|$ on $C_2$, we have

$$\left|\int_{C_2}\frac{\phi(z)}{z-s}\,dz\right| \le \int_{C_2}\frac{|\phi(z)|}{|z-s|}\,|dz| \le \frac{M}{b-|s|-|a|}\int_{C_2}|dz| = \frac{\pi b M}{b-|s|-|a|}$$


FIGURE 16.6

The contour used to obtain the complex inversion integral.


As $b$ becomes infinite, the fraction

$$\frac{b}{b - |s| - |a|}$$

approaches 1, and at the same time $M$ approaches zero, since,
by hypothesis, $|\phi(z)|$ converges uniformly to zero as $z$ becomes
infinite through the right half plane $R$. Hence,

$$\lim_{b\to\infty}\int_{C_2}\frac{\phi(z)}{z-s}\,dz = 0$$

and in the limit we have, from (3),

$$\phi(s) = \lim_{b\to\infty}\frac{1}{2\pi i}\int_{a+ib}^{a-ib}\frac{\phi(z)}{z-s}\,dz = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\frac{\phi(z)}{s-z}\,dz$$


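Let us now attempt to determine the function of $t$ whose
Laplace transform is $\phi(s)$. Taking the inverse of $\phi(s)$ as defined
by the last expression, we have

$$f(t) = \mathcal{L}^{-1}\{\phi(s)\} = \mathcal{L}^{-1}\left\{\frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\frac{\phi(z)}{s-z}\,dz\right\}$$

Assuming that the operations of integrating along the vertical
line $x = a$ and applying the inverse Laplace transformation can
be interchanged, the last equation can be written

$$f(t) = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\mathcal{L}^{-1}\left\{\frac{\phi(z)}{s-z}\right\}dz$$

or, since the operator $\mathcal{L}^{-1}$ refers only to the variable $s$,

$$f(t) = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\phi(z)\,\mathcal{L}^{-1}\left\{\frac{1}{s-z}\right\}dz$$

Now the specific result

$$\mathcal{L}^{-1}\left\{\frac{1}{s-z}\right\} = e^{zt}$$

is known to us through independent reasoning. Hence, we have
finally

$$f(t) = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\phi(z)\,e^{tz}\,dz$$

which, except that the variable of integration is $z$ instead of $s$,
is exactly Eq. (2). From this result it is clear that the inversion
integral is a line integral in the complex plane, taken along a vertical
line to the right of all the singularities of the transform $\phi(s)$ or along
any other path into which this can legitimately be deformed.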

In the usual applications, the evaluation of the complex
inversion integral is accomplished by the method of residues,
using a semicircular contour whose diameter is the segment joining
the points $a - ib$ and $a + ib$ and whose radius $b$ is large
enough to ensure that all the poles of the transform are within
the contour (Fig. 16.7). Specifically, we have the following result:


THEOREM 1

If $\phi(s)$ is an analytic function of $s$ except at a finite number of poles each of which
lies to the left of the vertical line $\mathrm{Re}(s) = a$ and if $s\phi(s)$ is bounded as $s$ becomes
infinite through the half plane $\mathrm{Re}(s) \le a$, then

$$\mathcal{L}^{-1}\{\phi(s)\} = \sum \text{residues of } \phi(s)e^{st} \text{ at each of its poles}$$

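PROOF Using the contour shown in Fig. 16.7, we have, by the residue theorem,

$$\frac{1}{2\pi i}\int_{a-ib}^{a+ib}\phi(s)e^{st}\,ds + \frac{1}{2\pi i}\int_{C_2}\phi(s)e^{st}\,ds = \sum \text{residues of } \phi(s)e^{st}$$

Hence,

(4)
$$\frac{1}{2\pi i}\int_{a-ib}^{a+ib}\phi(s)e^{st}\,ds = \sum \text{residues of } \phi(s)e^{st} - \frac{1}{2\pi i}\int_{C_2}\phi(s)e^{st}\,ds$$

Now along $C_2$ we have

$$s = a + be^{i\theta}$$

and, for sufficiently large $s$,

$$|s\phi(s)| < M \quad\text{and}\quad |s - a| \le |s| + |a| < 2|s|$$

Therefore,

$$\begin{aligned}
\left|\frac{1}{2\pi i}\int_{C_2}\phi(s)e^{st}\,ds\right| &\le \frac{1}{2\pi}\int_{C_2}|\phi(s)|\,|e^{st}|\,|ds| \\
&= \frac{1}{2\pi}\int_{\pi/2}^{3\pi/2}|\phi(s)|\,\left|e^{t[a+b(\cos\theta+i\sin\theta)]}\right|\,\left|bie^{i\theta}\,d\theta\right| \\
&= \frac{1}{2\pi}\int_{\pi/2}^{3\pi/2}|\phi(s)|\,|s-a|\,e^{t(a+b\cos\theta)}\,d\theta \\
&= \frac{1}{2\pi}\int_{\pi/2}^{3\pi/2}|s\phi(s)|\,\frac{|s-a|}{|s|}\,e^{t(a+b\cos\theta)}\,d\theta \\
&< \frac{1}{2\pi}\,2Me^{at}\int_{\pi/2}^{3\pi/2}e^{bt\cos\theta}\,d\theta
\end{aligned}$$

If we now set $\theta = \pi/2 + \alpha$ and then take advantage of the symmetry of the
resulting integrand, the last integral becomes

$$\frac{M}{\pi}\,e^{at}\int_0^{\pi}e^{-bt\sin\alpha}\,d\alpha = \frac{2M}{\pi}\,e^{at}\int_0^{\pi/2}e^{-bt\sin\alpha}\,d\alpha$$

Now it is evident from Fig. 16.8 that

$$\sin\alpha \ge \frac{2\alpha}{\pi} \qquad 0 \le \alpha \le \frac{\pi}{2}$$

Hence the last integral is overestimated if we replace $\sin\alpha$ in the exponent by the
smaller positive quantity $2\alpha/\pi$. Doing this and then performing the integration,
we have

$$\left|\frac{1}{2\pi i}\int_{C_2}\phi(s)e^{st}\,ds\right| < \frac{2M}{\pi}\,e^{at}\int_0^{\pi/2}e^{-2bt\alpha/\pi}\,d\alpha = \frac{Me^{at}}{bt}\left(1 - e^{-bt}\right)$$

For $t > 0$, the last expression clearly approaches zero as $b$ becomes infinite. Hence,
returning to Eq. (4), it is clear that

$$\lim_{b\to\infty}\frac{1}{2\pi i}\int_{a-ib}^{a+ib}\phi(s)e^{st}\,ds \equiv \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty}\phi(s)e^{st}\,ds = \mathcal{L}^{-1}\{\phi(s)\} = \sum \text{residues of } \phi(s)e^{st}$$

as asserted.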
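The key estimate in the proof, that replacing $\sin\alpha$ by $2\alpha/\pi$ overestimates the integral and that the resulting bound tends to zero as $b$ grows, can be checked numerically. The following is only a sketch: the function names, the midpoint rule, and the sample values $M = 1$, $a = 0$, $t = 0.5$ are illustrative assumptions, not part of the text.

```python
import math

def semicircle_bound(b, t, M=1.0, a=0.0, n=20000):
    """Midpoint-rule value of (2M/pi) e^{at} * integral_0^{pi/2} exp(-b t sin(alpha)) d(alpha)."""
    h = (math.pi / 2) / n
    total = sum(math.exp(-b * t * math.sin((k + 0.5) * h)) for k in range(n))
    return (2 * M / math.pi) * math.exp(a * t) * total * h

def closed_form_bound(b, t, M=1.0, a=0.0):
    """The overestimate M e^{at} (1 - e^{-bt}) / (bt) obtained from sin(alpha) >= 2 alpha / pi."""
    return M * math.exp(a * t) * (1 - math.exp(-b * t)) / (b * t)
```

For a fixed positive t, evaluating both functions for increasing b shows the first staying below the second while both shrink toward zero, which is the behavior the proof relies on.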

The proof of the last theorem breaks down if $\phi(s)$ has infinitely
many poles, because then, as $b \to \infty$, there will always be
semicircles $C_2$ on which $|s\phi(s)|$ is not bounded. However, by
choosing a sequence of semicircles whose radii become infinite
and no one of which passes through a pole of $\phi(s)$, it is possible
to show that the result of the last theorem is still valid in the case
when $\phi(s)$ has an infinite number of poles.*



* For a more detailed discussion of this point see, for instance, R. V. Churchill,
“Operational Mathematics,” 2d ed., pp. 190-193, McGraw-Hill Book
Company, New York, 1958.


EXAMPLE 1

What is $\mathcal{L}^{-1}\left\{\dfrac{1}{(s+a)^2+b^2}\right\}$?

Using Theorem 1, we have only to compute the residues of

$$\frac{e^{st}}{(s+a)^2+b^2}$$

at its two first-order poles $-a \pm ib$. At $s = -a + ib$, we have for the residue

$$\lim_{s\to -a+ib}\frac{[s-(-a+ib)]\,e^{st}}{[s-(-a+ib)][s-(-a-ib)]} = \frac{e^{(-a+ib)t}}{2ib}$$

and, at $s = -a - ib$, we have for the residue

$$\lim_{s\to -a-ib}\frac{[s-(-a-ib)]\,e^{st}}{[s-(-a+ib)][s-(-a-ib)]} = \frac{e^{(-a-ib)t}}{-2ib}$$

Hence, by Theorem 1,

$$f(t) = \mathcal{L}^{-1}\{\phi(s)\} = \frac{e^{(-a+ib)t}}{2ib} + \frac{e^{(-a-ib)t}}{-2ib} = \frac{e^{-at}\sin bt}{b}$$

This example, of course, has been merely a new approach to
a result with which we were already familiar. However, in more
difficult applications the use of the complex inversion integral
and contour integration is often either the only or, at least, the
best way of finding a function when its transform is known.
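As a quick numerical cross-check of Example 1, the two residues can be summed in complex arithmetic and compared with $e^{-at}\sin bt/b$. This is a minimal sketch; the function name and the sample values of a, b, and t are assumptions chosen only for illustration.

```python
import cmath
import math

def inverse_example1(a, b, t):
    """Sum of the residues of e^{st} / ((s + a)^2 + b^2) at its simple poles -a + ib and -a - ib."""
    poles = (complex(-a, b), complex(-a, -b))
    total = 0j
    for pole in poles:
        other = poles[1] if pole == poles[0] else poles[0]
        # Residue at a simple pole: e^{pole * t} divided by the remaining linear factor.
        total += cmath.exp(pole * t) / (pole - other)
    return total

value = inverse_example1(0.7, 2.0, 1.3)
expected = math.exp(-0.7 * 1.3) * math.sin(2.0 * 1.3) / 2.0
```

The imaginary parts of the two residues cancel, so `value` is real up to rounding and agrees with `expected`.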


EXAMPLE 2

What is $\mathcal{L}^{-1}\left\{\dfrac{1}{s\cosh as}\right\}$?

Obviously in this case the function $\phi(s)$ has a first-order pole at $s = 0$. Moreover, since
$\cosh as = \cos ias$ [Eq. (13), Sec. 14.7], it follows that $\phi(s)$ has infinitely many other first-order
poles, namely, the points where

$$ias = \pm\frac{(2n-1)\pi}{2} \qquad\text{or}\qquad s = \pm\frac{(2n-1)\pi}{2ia} \qquad n = 1, 2, 3, \ldots$$

However, if we set $s = \sigma + i\omega$, we have, by Eq. (17), Sec. 14.7,

$$\left|\frac{1}{\cosh as}\right| = \frac{1}{\sqrt{\cosh^2 a\sigma\,\cos^2 a\omega + \sinh^2 a\sigma\,\sin^2 a\omega}} = \frac{1}{\sqrt{\cosh^2 a\sigma - \sin^2 a\omega}}$$

and this is bounded on any semicircle which does not pass through one of the poles of $\phi(s)$.
Hence, the inverse of $\phi(s)$ is simply the sum of the residues of

$$\frac{e^{st}}{s\cosh as}$$

at each of its poles, i.e., the poles of $\phi(s)$.

At $s = 0$ the residue is

$$\lim_{s\to 0}\frac{e^{st}}{\cosh as} = 1$$

and, at $s = \dfrac{(2n-1)\pi}{2ia}$ [using l'Hospital's rule and Eq. (20), Sec. 14.7, to evaluate the
indeterminacy], the residue is

$$\lim_{s\to(2n-1)\pi/2ia}\frac{[s-(2n-1)\pi/2ia]\,e^{st}}{s\cosh as} = \frac{e^{(2n-1)\pi t/2ia}}{\dfrac{(2n-1)\pi}{2ia}\,a\sinh\dfrac{(2n-1)\pi}{2i}} = \frac{2(-1)^n e^{(2n-1)\pi t/2ia}}{(2n-1)\pi}$$

Similarly, at $s = -\dfrac{(2n-1)\pi}{2ia}$ the residue is

$$\frac{2(-1)^n e^{-(2n-1)\pi t/2ia}}{(2n-1)\pi}$$

Hence, pairing the terms which correspond to the same value of $n$,

$$f(t) = \mathcal{L}^{-1}\left\{\frac{1}{s\cosh as}\right\} = 1 + \frac{4}{\pi}\sum_{n=1}^{\infty}\frac{(-1)^n}{2n-1}\cos\frac{(2n-1)\pi t}{2a}$$
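The series obtained in Example 2 can be compared with an independent representation. Expanding $1/\cosh as = 2e^{-as}/(1+e^{-2as})$ as a geometric series suggests that the inverse is a square wave equal to 0 on $0 < t < a$, 2 on $a < t < 3a$, 0 on $3a < t < 5a$, and so on; that expansion is not taken from the text and is used here only as a check. The function names, the number of terms, and the sample points below are illustrative assumptions.

```python
import math

def series_inverse(t, a, terms=20000):
    """Partial sum of 1 + (4/pi) * sum_{n>=1} (-1)^n / (2n-1) * cos((2n-1) pi t / (2a))."""
    total = 1.0
    for n in range(1, terms + 1):
        k = 2 * n - 1
        total += (4 / math.pi) * (-1) ** n / k * math.cos(k * math.pi * t / (2 * a))
    return total

def square_wave_inverse(t, a):
    """Assumed closed form: 2 * sum_k (-1)^k u(t - (2k+1)a), with u the unit step."""
    jumps = int(math.floor((t / a + 1) / 2))   # number of odd multiples of a not exceeding t
    return 2.0 if jumps % 2 == 1 else 0.0
```

Away from the jump points t = a, 3a, 5a, ... the partial sums settle on the square-wave values; near the jumps convergence is slow, as expected for a Fourier-type series.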


EXERCISES

Using the complex inversion integral, find the inverses of the following Laplace transforms. In
each case discuss the resemblance of the method of residues to the use of the Heaviside expansion
theorems (Sec. 7.5).


(& + !)(« + 3) 

3 JTi 

5 1 

s(s ! + 1) 

7 ( 7 * + 4) 5 

9 i+J 

(s + 2) 2 (s + 3) 


2 

4 

6 

8 

10 


(« + 2) 2 

s 

s z + 4s -f- 13 
s 3 + i 

1 

(s 2 + 9)(s* + 4) 
1 

( S 2 + 2 s + 5) 5 


11 Complete the solution of Exercise 8, Sec. 8.7, by finding the angular displacement at a 
general point x. 

Find the inverse of each of the following transforms: 


12 

15 


s sinh as 

UrVs\ 

sloiy/'s) 


13 


(s + b) cosh as 


sinh x y/ s 
s sinh -y^ 


where $I_0$ is the modified Bessel function of the first kind.


16.4 

Stability criteria 

In the analysis of many physical systems a complete description
of the behavior of the system is unnecessary, and all that is
required is a knowledge of whether or not the system is stable, i.e.,
whether its response to a bounded excitation remains bounded or
becomes infinite as $t \to \infty$. As we shall see in this section, this
question can be answered by analyzing the Laplace transform of
the response without actually determining the response itself.

We begin by supposing that, by methods such as those we
described in Chap. 7, we have obtained the Laplace transform
of the response of the system $\mathcal{L}\{y(t)\} = \phi(s)$ and that $\phi(s)$ is a
rational function; i.e.,

$$\phi(s) = \frac{P(s)}{Q(s)}$$

where $P$ and $Q$ are real polynomials in the complex variable
$s = \sigma + i\omega$. Now we know from algebra that any polynomial, such
as $Q(s)$, can always be factored into real linear and quadratic
factors that may or may not be repeated. Moreover, we know from
the Heaviside theorems (Sec. 7.5) that the form of the inverse
$y(t) = \mathcal{L}^{-1}\{\phi(s)\}$ is determined completely and solely by the
factors of $Q(s)$ and that the only terms which can possibly occur
in it are the following:

    Factor                               Term

From unrepeated factors

1.  $s$                                  $1$
2.  $s^2 + b^2$                          $\cos bt$, $\sin bt$
3.  $s - a$                              $e^{at}$
4.  $(s - a)^2 + b^2$                    $e^{at}\cos bt$, $e^{at}\sin bt$

From repeated factors

5.  $s^n$, $n > 1$                       $t^k$, $0 \le k \le n - 1$
6.  $(s^2 + b^2)^n$, $n > 1$             $t^k\cos bt$, $t^k\sin bt$, $0 \le k \le n - 1$
7.  $(s - a)^n$, $n > 1$                 $t^k e^{at}$, $0 \le k \le n - 1$
8.  $[(s - a)^2 + b^2]^n$, $n > 1$       $t^k e^{at}\cos bt$, $t^k e^{at}\sin bt$, $0 \le k \le n - 1$


Clearly, terms of the forms 1 and 2 are stable in all cases, for,
although they do not approach zero as $t \to \infty$, they do remain
finite. Terms of the forms 3, 4, 7, and 8 are stable if and only if $a$
is negative, in which case they not only remain finite but in fact
approach zero as $t \to \infty$. Terms of the forms 5 and 6 are unstable
in all cases, since, because of the factor $t$, each becomes unbounded
as $t \to \infty$. Translating these observations into conditions on the
roots of the polynomial equation $Q(s) = 0$, we see that the response
$y(t)$ will be stable if and only if the following conditions are met:

a Every unrepeated real root is nonpositive,
b Every repeated real root is negative,
c Every pure imaginary root is unrepeated,
d Every general complex root has negative real part.
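Conditions a to d amount to a simple test on the roots of $Q(s) = 0$ and their multiplicities. The sketch below encodes that test; the function name, the (root, multiplicity) input format, and the tolerance used to decide whether a real part is zero are assumptions made for illustration.

```python
def response_is_stable(roots, tol=1e-12):
    """roots: iterable of (root, multiplicity) pairs for Q(s) = 0.

    Returns True when no root lies to the right of the imaginary axis and
    every root on the imaginary axis (including s = 0) is unrepeated.
    """
    for root, multiplicity in roots:
        real_part = complex(root).real
        if real_part > tol:
            return False            # violates condition a or d
        if abs(real_part) <= tol and multiplicity > 1:
            return False            # violates condition b or c
    return True
```

For instance, `response_is_stable([(-1, 2), (2j, 1), (-2j, 1), (0, 1)])` accepts a repeated negative root together with simple roots on the imaginary axis, while `response_is_stable([(0, 2)])` rejects a double root at the origin.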




Geometrically speaking, these conditions can be described as
follows:


THEOREM 1

In order for the function

$$y(t) = \mathcal{L}^{-1}\left\{\frac{P(s)}{Q(s)}\right\}$$

to be stable, it is necessary and sufficient that the equation $Q(s) = 0$ have no roots
to the right of the imaginary axis in the complex $s$-plane and that any root on the
imaginary axis in the $s$-plane be unrepeated.


Various methods are available for determining whether or not
the roots of a polynomial equation all have nonpositive real parts. * 
In general, however, these are more conveniently formulated as 
methods for determining whether or not the roots all have real 
parts that are strictly negative, and most, though not all, of our 
results will be of this nature. This is not a serious disadvantage, 
because in practice zero roots and pure imaginary roots, i.e., 
roots whose real parts are zero, if they occur at all, are usually 
easily recognizable. 

A preliminary result of considerable importance is contained 
in the following theorem: 


THEOREM 2

The real part of each root of the polynomial equation $Q(s) = 0$ is less than or equal
to zero only if the coefficients in $Q(s)$ all have the same sign.

PROOF We observe first that it is no specialization to interpret the condition
of the theorem as asserting that all coefficients in $Q(s)$ are positive. For the
case in which all coefficients are negative can be converted into the case
in which all coefficients are positive, and vice versa, simply by multiplying
$Q(s) = 0$ by $-1$, which, of course, in no way alters the roots of this equation.
Now, if every root of $Q(s) = 0$ has nonpositive real part, then the only possible
factors of $Q(s)$ are of the forms

$$s + a_i \quad\text{and}\quad (s + a_j)^2 + b_j^2 \qquad\text{where } a_i,\ a_j \ge 0$$

Since these factors contain only nonnegative terms and since $Q(s)$ is simply the
product of a finite number of these factors, it is clear that every nonzero coefficient
in $Q(s)$ must be positive, as asserted.

It is also clear from the preceding argument that, if every $a$
is positive, so that all roots of $Q(s) = 0$ have real parts strictly
negative, then there can be no zero coefficients in $Q(s)$; i.e., all
terms must be present. Hence, restating this observation
contrapositively, we have the following corollary:


* See, for instance, A. Bronwell, “Advanced Mathematics in Physics and 
Engineering,” pp. 386-413, McGraw-Hill Book Company, New York, 1953, 
and E. A. Guillemin, “The Mathematics of Circuit Analysis,” pp. 395-409, 
John Wiley & Sons, Inc., New York, 1953. 




COROLLARY 1 

If one or more terms are missing from Q(s), then the equation Q(s) = 0 has at 
least one root whose real part is nonnegative. 

The condition of Theorem 2 is only a necessary and not a
sufficient one; that is, it cannot be asserted, conversely, that, if the
coefficients in $Q(s)$ all have the same sign, then the real part of
each root of $Q(s) = 0$ is nonpositive. For instance,

$$s^4 + s^3 + s^2 + 11s + 10$$

contains only terms with positive coefficients; yet the roots of the
equation

$$s^4 + s^3 + s^2 + 11s + 10 = 0$$

are

$$s = -1,\ -2,\ 1 \pm 2i$$

and the two complex roots have positive real parts. On the other
hand, it is clear from Theorem 2 that we do have the following
result:

COROLLARY 2 

If $Q(s)$ contains some terms with positive coefficients and some terms with negative
coefficients, then the equation $Q(s) = 0$ has at least one root whose real part is
positive.
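The quartic counterexample given just before Corollary 2 is easy to verify by substituting the stated roots into the polynomial. The sketch below does this with Horner's rule; the helper name `poly_eval` and the variable names are illustrative assumptions.

```python
def poly_eval(coeffs, s):
    """Horner evaluation of a polynomial; coeffs are listed from the highest power down."""
    value = 0
    for c in coeffs:
        value = value * s + c
    return value

Q = [1, 1, 1, 11, 10]                      # s^4 + s^3 + s^2 + 11s + 10
claimed_roots = [-1, -2, 1 + 2j, 1 - 2j]
residuals = [abs(poly_eval(Q, r)) for r in claimed_roots]
right_half_plane = [r for r in claimed_roots if complex(r).real > 0]
```

All four residuals vanish, and two of the roots lie in the right half plane even though every coefficient of the polynomial is positive.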

For quadratic equations the necessary condition of Theorem
2 is also sufficient. For if the equation $a_0s^2 + a_1s + a_2 = 0$ contains
no negative coefficients, then its roots

$$s = \frac{-a_1 \pm \sqrt{a_1^2 - 4a_0a_2}}{2a_0}$$

are clearly either nonpositive real numbers or conjugate complex
numbers with nonpositive real parts.

For cubic equations, a sufficient condition, supplementing 
Theorem 2, is contained in the following result: 


THEOREM 3 

A necessary and sufficient condition that every root of the cubic equation
$a_0s^3 + a_1s^2 + a_2s + a_3 = 0$ have negative real part is that all coefficients have
the same sign and that $a_1a_2 - a_0a_3 > 0$.

PROOF Let us assume for definiteness that the given equation has one real
root $r$ and one pair of conjugate complex roots $p \pm iq$. The case in which the
equation has three real roots can be handled in exactly the same fashion. From algebra
we recall that the roots, say $r_1$, $r_2$, $r_3$, of any cubic equation are related to the
coefficients through the equations

$$\frac{a_1}{a_0} = -(r_1 + r_2 + r_3)$$

$$\frac{a_2}{a_0} = r_1r_2 + r_2r_3 + r_3r_1$$

$$\frac{a_3}{a_0} = -r_1r_2r_3$$

In the present case these become

(1)
$$\frac{a_1}{a_0} = -(r + 2p)$$

(2)
$$\frac{a_2}{a_0} = p^2 + q^2 + 2pr$$

(3)
$$\frac{a_3}{a_0} = -r(p^2 + q^2)$$

From (3) and the assumption that the $a$'s all have the same sign, it follows that
$r < 0$. To prove that $p < 0$, we note that the condition $a_1a_2 - a_0a_3 > 0$ can be
rewritten, after division by $a_0^2$, as

$$\frac{a_1}{a_0}\,\frac{a_2}{a_0} - \frac{a_3}{a_0} > 0$$

When the ratios of the $a$'s are replaced by their equivalents from (1), (2), and (3),
this becomes

$$-(r + 2p)(p^2 + q^2 + 2pr) + r(p^2 + q^2) > 0$$

or, simplifying and rearranging,

(4)
$$-2p\left[(p^2 + q^2 + 2pr) + r^2\right] > 0$$

Now from (2) and the hypothesis that the $a$'s are all of the same sign, it is evident
that $p^2 + q^2 + 2pr > 0$. Hence, $(p^2 + q^2 + 2pr) + r^2 > 0$, and it follows from
(4) that $p < 0$, as asserted. This proves the sufficiency of the conditions of
Theorem 3.

The necessity that all the coefficients have the same sign follows immediately
from (1), (2), and (3), since the right-hand sides of these relations are all positive
if $p < 0$ and $r < 0$. The necessity of the condition $a_1a_2 - a_0a_3 > 0$ follows by
reversing the above steps and working backward to this inequality from (4),
which is surely true if $p < 0$ and $r < 0$.
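The criterion of Theorem 3 is short enough to state as a function. The following sketch simply encodes the two conditions of the theorem; the function name is an assumption, and the test cubics are chosen for illustration (two of them reappear as denominators later in this section).

```python
def cubic_roots_all_negative(a0, a1, a2, a3):
    """Theorem 3 for a0 s^3 + a1 s^2 + a2 s + a3 = 0 with real coefficients.

    True when all coefficients have the same sign and a1*a2 - a0*a3 > 0.
    """
    coeffs = (a0, a1, a2, a3)
    same_sign = all(c > 0 for c in coeffs) or all(c < 0 for c in coeffs)
    return same_sign and (a1 * a2 - a0 * a3) > 0
```

For example, `cubic_roots_all_negative(1, 6, 11, 6)` accepts $(s+1)(s+2)(s+3)$, while `cubic_roots_all_negative(1, 1, 1, 1)` rejects $(s+1)(s^2+1)$, whose roots $\pm i$ lie on the imaginary axis.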

The extension of Theorem 3 to polynomial equations of higher degree is
contained in the next theorem, which we state without proof.*

THEOREM 4

In the polynomial equation

$$Q(s) = a_0s^n + a_1s^{n-1} + a_2s^{n-2} + \cdots + a_{n-1}s + a_n = 0$$

let every coefficient be positive, and construct the $n$ quantities

$$D_1 = a_1 \qquad D_2 = \begin{vmatrix} a_1 & a_0 \\ a_3 & a_2 \end{vmatrix} \qquad D_3 = \begin{vmatrix} a_1 & a_0 & 0 \\ a_3 & a_2 & a_1 \\ a_5 & a_4 & a_3 \end{vmatrix} \qquad \cdots$$

$$D_n = \begin{vmatrix}
a_1 & a_0 & 0 & 0 & 0 & 0 & \cdots & 0 \\
a_3 & a_2 & a_1 & a_0 & 0 & 0 & \cdots & 0 \\
a_5 & a_4 & a_3 & a_2 & a_1 & a_0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & & \vdots \\
a_{2n-1} & a_{2n-2} & a_{2n-3} & a_{2n-4} & a_{2n-5} & a_{2n-6} & \cdots & a_n
\end{vmatrix}$$

where, in each determinant, all $a$'s with negative subscripts or with subscripts
greater than $n$ are to be replaced by zero. Then a necessary and sufficient condition
that each root of $Q(s) = 0$ have negative real part is that each of the quantities
$D_1, D_2, \ldots, D_n$ be positive.

This is commonly known as the Routh or Routh-Hurwitz stability
criterion.

* See, for instance, J. V. Uspensky, “Theory of Equations,” pp. 304-309,
McGraw-Hill Book Company, New York, 1948.


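EXAMPLE 1

For the equation $s^5 + s^4 + 2s^3 + s^2 + s + 2 = 0$ we have

$$D_1 = 1 \qquad D_2 = \begin{vmatrix} 1 & 1 \\ 1 & 2 \end{vmatrix} = 1 \qquad D_3 = \begin{vmatrix} 1 & 1 & 0 \\ 1 & 2 & 1 \\ 2 & 1 & 1 \end{vmatrix} = 2$$

$$D_4 = \begin{vmatrix} 1 & 1 & 0 & 0 \\ 1 & 2 & 1 & 1 \\ 2 & 1 & 1 & 2 \\ 0 & 0 & 2 & 1 \end{vmatrix} = -4 \qquad D_5 = \begin{vmatrix} 1 & 1 & 0 & 0 & 0 \\ 1 & 2 & 1 & 1 & 0 \\ 2 & 1 & 1 & 2 & 1 \\ 0 & 0 & 2 & 1 & 1 \\ 0 & 0 & 0 & 0 & 2 \end{vmatrix} = -8$$

Since not all of the $D$'s are positive, the given equation has at least one root whose real part is
nonnegative. This can be confirmed, of course, by actually finding the roots of the given
equation, which are, in fact,

$$r_1 = -1 \qquad r_2, r_3 = \frac{1}{2} \pm \frac{i\sqrt{3}}{2} \qquad r_4, r_5 = -\frac{1}{2} \pm \frac{i\sqrt{7}}{2}$$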
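The determinants of Theorem 4 are tedious to expand by hand, so a small routine is useful for checking them. The sketch below builds the $n \times n$ array whose entry in row $i$, column $j$ is $a_{2i-j}$ (with out-of-range subscripts replaced by zero) and evaluates its leading principal minors by cofactor expansion, which is adequate only for small $n$. The function names are assumptions; for the quintic of Example 1 it produces the values 1, 1, 2, -4, -8.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def hurwitz_determinants(coeffs):
    """coeffs = [a0, a1, ..., an]; returns the list [D1, D2, ..., Dn] of Theorem 4."""
    n = len(coeffs) - 1

    def a(k):
        # Coefficients with subscripts outside 0..n are replaced by zero.
        return coeffs[k] if 0 <= k <= n else 0

    # With 0-based i and j, the entry is a_{2(i+1) - (j+1)} = a_{2i + 1 - j}.
    full = [[a(2 * i + 1 - j) for j in range(n)] for i in range(n)]
    return [det([row[:k] for row in full[:k]]) for k in range(1, n + 1)]
```

Since $D_4$ and $D_5$ come out negative for Example 1, the criterion reports at least one root with nonnegative real part, in agreement with the roots $\tfrac12 \pm i\sqrt3/2$.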

A somewhat different method of obtaining information about
the location of the roots of an equation $f(z) = 0$, which has the
advantage that it tells exactly how many roots there are with
positive real parts and, moreover, is not restricted to the case
where $f(z)$ is a polynomial, is based on the following theorem:


THEOREM 5

If $f(z)$ is analytic within and on a closed curve $C$ except at a finite number of poles
and if $f(z)$ has neither poles nor zeros on $C$, then

$$\frac{1}{2\pi i}\oint_C\frac{f'(z)}{f(z)}\,dz = N - P$$

where $N$ is the number of zeros of $f(z)$ within $C$, and $P$ is the number of poles of
$f(z)$ within $C$, each counted as many times as its multiplicity.

PROOF Suppose first that, at a point $z = a_k$ within $C$, $f(z)$ has a zero of order
$n_k$. Then $f(z)$ can be written

$$f(z) = (z - a_k)^{n_k}\phi(z)$$

where $\phi(z)$ is nonvanishing and analytic in some neighborhood of $z = a_k$. From
this

$$f'(z) = n_k(z - a_k)^{n_k-1}\phi(z) + (z - a_k)^{n_k}\phi'(z)$$

and thus

$$\frac{f'(z)}{f(z)} = \frac{n_k(z - a_k)^{n_k-1}\phi(z) + (z - a_k)^{n_k}\phi'(z)}{(z - a_k)^{n_k}\phi(z)} = \frac{n_k}{z - a_k} + \frac{\phi'(z)}{\phi(z)}$$

Since $\phi(z)$, and hence $\phi'(z)$, is analytic at $z = a_k$ and since $\phi(z)$ does not vanish at
$z = a_k$, the fraction $\phi'(z)/\phi(z)$ is analytic at $z = a_k$. Hence it is clear from the last
expression that $f'(z)/f(z)$ has a simple pole with residue $n_k$ at every point $a_k$
where $f(z)$ has a zero of order $n_k$. Similarly, if $f(z)$ has a pole of order $p_k$ at the
point $z = b_k$, we can write

$$f(z) = \frac{c_{-p_k}}{(z - b_k)^{p_k}} + \frac{c_{-p_k+1}}{(z - b_k)^{p_k-1}} + \cdots + \frac{c_{-1}}{z - b_k} + c_0 + c_1(z - b_k) + \cdots$$

Hence, putting these fractions over a common denominator, we have, in the
neighborhood of $z = b_k$,

$$f(z) = \frac{\psi(z)}{(z - b_k)^{p_k}} = (z - b_k)^{-p_k}\psi(z)$$

where

$$\psi(z) = c_{-p_k} + c_{-p_k+1}(z - b_k) + c_{-p_k+2}(z - b_k)^2 + \cdots$$

is obviously analytic and nonvanishing at $z = b_k$. Therefore, around $b_k$,

$$f'(z) = -p_k(z - b_k)^{-p_k-1}\psi(z) + (z - b_k)^{-p_k}\psi'(z)$$

and thus

$$\frac{f'(z)}{f(z)} = \frac{-p_k(z - b_k)^{-p_k-1}\psi(z) + (z - b_k)^{-p_k}\psi'(z)}{(z - b_k)^{-p_k}\psi(z)} = \frac{-p_k}{z - b_k} + \frac{\psi'(z)}{\psi(z)}$$

The last fraction on the right is clearly analytic; hence, $f'(z)/f(z)$ has a simple
pole with residue $-p_k$ at every point where $f(z)$ has a pole of order $p_k$. Applying
the residue theorem to $f'(z)/f(z)$ over the region bounded by $C$, we therefore have

$$\oint_C\frac{f'(z)}{f(z)}\,dz = 2\pi i\sum\text{residues} = 2\pi i\left(\sum n_k - \sum p_k\right) = 2\pi i(N - P)$$

since $\sum n_k$ is the total multiplicity $N$ of all the zeros of $f(z)$ within $C$ and $\sum p_k$ is the
total multiplicity $P$ of all the poles of $f(z)$ within $C$. Dividing by $2\pi i$, we obtain
the assertion of the theorem.
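Theorem 5 lends itself to a direct numerical check: on a circle the contour integral of $f'(z)/f(z)$ can be approximated by an equally spaced sum, which converges very quickly for integrands analytic near the circle. The sketch below assumes the caller supplies both $f$ and its derivative; the function name, the circular contour, and the test function $f(z) = z + 1/z$ (zeros at $\pm i$, a simple pole at 0) are illustrative choices, not taken from the text.

```python
import cmath
import math

def zeros_minus_poles(f, fprime, radius, center=0j, n=4000):
    """Approximate (1 / (2 pi i)) * closed integral of f'(z)/f(z) dz over a circle."""
    total = 0j
    for k in range(n):
        theta = 2 * math.pi * k / n
        offset = radius * cmath.exp(1j * theta)      # z - center
        z = center + offset
        # dz = i * offset * d(theta); the factor i cancels the i in 2*pi*i,
        # and d(theta) / (2*pi) becomes 1/n.
        total += fprime(z) / f(z) * offset
    return total / n

f = lambda z: z + 1 / z
fprime = lambda z: 1 - 1 / z ** 2
outer = zeros_minus_poles(f, fprime, 2.0)   # encloses both zeros and the pole: N - P = 2 - 1
inner = zeros_minus_poles(f, fprime, 0.5)   # encloses only the pole: N - P = 0 - 1
```

The two results differ from 1 and -1 only by rounding error, as the theorem predicts.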


An important alternative form of the last theorem can be
derived by noting that

$$\frac{f'(z)}{f(z)} = \frac{d}{dz}\left[\ln f(z)\right]$$

Hence, performing the integration,

$$N - P = \frac{1}{2\pi i}\left[\text{variation of } \ln f(z) \equiv \ln|f(z)| + i\arg f(z) \text{ in going completely around } C\right]$$

Clearly, $\ln|f(z)|$ is the same at the beginning and at the end of
any closed curve, and therefore

$$N - P = \frac{1}{2\pi i}\left[\text{variation of } i\arg f(z) \text{ around } C\right] = \frac{\text{variation of } \arg f(z) \text{ around } C}{2\pi}$$

In particular, if $f(z)$ is analytic everywhere within $C$ (so that
$P = 0$), we have the following important result, commonly known
as the principle of the argument:


COROLLARY 1

If $f(z)$ is analytic within and on a closed curve $C$ and does not vanish on $C$, then
the number of zeros of $f(z)$ within $C$ is equal to $1/2\pi$ times the net variation in the
argument of $f(z)$ as $z$ traverses the curve $C$ in the counterclockwise sense.





FIGURE 16.9 

A semicircular 
contour enclosing 
all zeros of a func- 
tion which lie in 
the right half 
plane. 


In geometric terms, this means that, if the locus of $w = f(z)$ is
plotted for values of $z$ ranging around the given contour $C$, then
the number of times this locus encircles the origin in the $w$-plane
is the number of zeros of $f(z)$ within $C$. Moreover, since $f(z) = 0$
implies $w = 0$, it is evident that if $f(z)$ has a zero on $C$, the image
curve passes through the origin in the $w$-plane.

To use the last theorem and its corollary to determine whether
or not each of the roots of a polynomial equation $Q(z) = 0$ has
negative real part, we proceed as follows. In the $z$-plane let the
contour $C$ consist of the segment of the imaginary axis between
$-iR$ and $iR$ and the semicircle lying in the right half plane and
having this segment as diameter (Fig. 16.9). Since a polynomial
equation has only a finite number of roots, it is clear that, if $R$ is
taken sufficiently large, any roots of $Q(z) = 0$ which lie in the
right half plane, i.e., any roots which have positive real parts,
will lie within $C$.

Now let $z$ range over the contour $C$, and in an auxiliary $w$-plane
let the locus of the corresponding values of $w = Q(z)$ be
plotted. If this curve does not enclose the origin in the $w$-plane,
then according to the corollary of Theorem 5, $Q(z) = 0$ has no
roots in the right half plane. If, further, this curve does not pass
through the origin in the $w$-plane, then $Q(z) = 0$ has no roots on
the imaginary axis either; i.e., all roots of $Q(z) = 0$ have negative
real parts. On the other hand, if the image curve encircles the
origin in the $w$-plane a net number of times $k$, then $Q(z) = 0$ has
$k$ roots in the right half plane, i.e., has $k$ roots with positive real
part. Moreover, for every time this curve passes through the
origin in the $w$-plane there is a root of $Q(z) = 0$ lying on the
imaginary axis in the $z$-plane. Distinct pure imaginary roots of
$Q(z) = 0$ thus give rise to a multiple point at the origin in the
$w$-plane, the tangents at the multiple point being distinct. A
repeated pure imaginary root in the $z$-plane similarly gives rise,
in general, to a cusp at the origin in the $w$-plane.

The labor of plotting the image curve in the $w$-plane can be
reduced considerably by letting $R \to \infty$. The image of the
semicircular portion of $C$ then recedes to infinity in the $w$-plane, and
without any plotting, its contribution to possible encirclements
of the origin can be determined as follows: On the semicircle we
have

$$z = Re^{i\theta} \qquad -\frac{\pi}{2} \le \theta \le \frac{\pi}{2}$$

For the images of these values of $z$ we have

$$w = Q(Re^{i\theta}) = a_0(Re^{i\theta})^n + a_1(Re^{i\theta})^{n-1} + \cdots + a_n$$

Now, for arbitrarily large values of $R$, all terms in $Q(Re^{i\theta})$ after
the first are negligible in comparison with the first term $a_0R^ne^{in\theta}$.
Hence, as $z$ traverses the semicircular portion of $C$ in the positive
direction, with $\theta = \arg z$ varying from $-\pi/2$ to $\pi/2$, the argument
of its image

$$w = a_0R^ne^{in\theta}$$

varies from $-n\pi/2$ to $n\pi/2$, which represents a net variation in
$\arg w$, that is, $\arg Q(z)$, of $n\pi$. Hence if $w = Q(z)$ is plotted only
for $z$ varying from $i\infty$ to $-i\infty$ along the imaginary axis and the
net change in the argument of $w$ is noted, with its proper sign, of
course, this change plus $n\pi$ will give the net change as the entire
contour $C$ is traversed. This change, divided by $2\pi$, gives the net
number of times the image curve encircles the origin in the $w$-plane,
and this number is equal to the number of roots of $Q(z) = 0$
in the right half of the $z$-plane. The labor of plotting can be still
further reduced by noting that, for polynomials with real coefficients,
such as we encounter in Laplace transforms, we have

$$Q(\bar{z}) = \overline{Q(z)}$$

and, hence, the plot of $Q(z)$ for values on the lower half of the
imaginary axis is just the reflection in the real axis of the plot of
$Q(z)$ for values of $z$ on the upper half of the imaginary axis.


EXAMPLE 2

Discuss the stability of $y(t)$ if $\mathcal{L}\{y(t)\} = (s^2 + 1)/(s^3 + s^2 + 4s + 1)$.

As we pointed out above, the stability of $y(t)$ is determined solely by the location of the zeros
of the denominator of $\mathcal{L}\{y(t)\}$. Hence, we begin by plotting

$$w = Q(s) = s^3 + s^2 + 4s + 1$$

for values of $s$ on the imaginary axis, i.e., for $s = i\omega$ and $\omega$ ranging from $\infty$ to $-\infty$. The
parametric equations of the image curve are easily obtained, for

$$Q(i\omega) = -i\omega^3 - \omega^2 + 4i\omega + 1$$

and so the real and imaginary parts of $w = u + iv$ are

$$u = 1 - \omega^2 \quad\text{and}\quad v = 4\omega - \omega^3$$

Figure 16.10 shows a plot of this curve together with a plot of $\arg w$. Evidently, as $s$ traverses the
imaginary axis from $i\infty$ to $-i\infty$, $\arg w$ varies from $3\pi/2$ to $-3\pi/2$, which is a net variation of
$-3\pi$. This, added to the value $n\pi = 3\pi$ contributed by the semicircular portion of the contour $C$
(Fig. 16.9), gives a net variation of zero as the entire contour $C$ is traversed. Hence, $Q(s)$ has no
zeros in the right half of the $s$-plane. Moreover, since the image curve does not pass through the
origin in the $w$-plane, $Q(s)$ has no zeros on the imaginary axis. Therefore, by our earlier
discussion, the inverse $y(t)$ is stable.

EXAMPLE 3

Discuss the stability of $y(t)$ if $\mathcal{L}\{y(t)\} = (s - 2)/(s^3 + s^2 + s + 4)$.

Proceeding exactly as in Example 2, we obtain from

$$Q(i\omega) = -i\omega^3 - \omega^2 + i\omega + 4$$

the parametric equations

$$u = 4 - \omega^2 \quad\text{and}\quad v = \omega - \omega^3$$

and the image curve shown in Fig. 16.11. In this case, as $s$ traverses the imaginary axis from $i\infty$
to $-i\infty$, $\arg w$ varies from $3\pi/2$, as in Example 2, to $5\pi/2$, which is a net variation of
$5\pi/2 - 3\pi/2 = \pi$. Hence, adding the variation $n\pi = 3\pi$ contributed by the semicircular portion
of the contour $C$ (Fig. 16.9), we obtain $4\pi$ for the net variation in $\arg w$ as the entire contour $C$
is traversed. Dividing this by $2\pi$, we obtain 2 as the number of zeros of $Q(s)$ in the right half
plane. The inverse in this case is, therefore, unstable.
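The graphical procedure of Examples 2 and 3 can be imitated numerically: follow $\arg Q(i\omega)$ continuously as $\omega$ runs from a large positive value down to a large negative value, add the contribution $n\pi$ of the semicircle, and divide by $2\pi$. The sketch below is one way to do this under stated assumptions: the substitution $\omega = \tan\theta$ (to cover a long stretch of the axis with fine sampling near the origin), the sample count, the cutoff `eps`, and the function names are all illustrative choices, and the polynomial is assumed to have no roots on the imaginary axis.

```python
import cmath
import math

def poly_eval(coeffs, s):
    """Horner evaluation; coeffs are listed from the highest power down."""
    value = 0j
    for c in coeffs:
        value = value * s + c
    return value

def right_half_plane_roots(coeffs, samples=20001, eps=1e-4):
    """Count roots with positive real part from the net variation of arg Q.

    coeffs = [a0, ..., an] are real; Q is assumed not to vanish on the imaginary axis.
    """
    n = len(coeffs) - 1
    # omega = tan(theta) runs from about +1/eps down to about -1/eps.
    thetas = [(math.pi / 2 - eps) * (1 - 2 * k / (samples - 1)) for k in range(samples)]
    axis_variation = 0.0
    previous = poly_eval(coeffs, 1j * math.tan(thetas[0]))
    for theta in thetas[1:]:
        current = poly_eval(coeffs, 1j * math.tan(theta))
        axis_variation += cmath.phase(current / previous)   # small signed increment of arg w
        previous = current
    total_variation = axis_variation + n * math.pi           # semicircle contributes n*pi
    return round(total_variation / (2 * math.pi))
```

Applied to the denominators of Examples 2 and 3, the routine returns 0 and 2 respectively, matching the counts obtained from the plots.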






Theorem 5 finds its best-known application in the so-called
Nyquist stability criterion, which is a modification of the preceding
process especially well adapted to the stability analysis of
closed-loop control systems. One common problem in engineering
is to make the output $x_o(t)$ of a system follow quickly and accurately
changes made in the input $x_i(t)$ to the system. In an open-loop
system, such as that shown in Fig. 16.12a, this is often difficult
to accomplish; specifically, prolonged oscillation of $x_o(t)$
about its desired value may well follow an abrupt change of the
input $x_i(t)$ to some desired new value. One possible way to remedy
this situation is to construct a feedback loop, such as the one
shown in Fig. 16.12b, which will sample the output and feed it
back to a differential device which will in turn transmit the error
signal $x_i(t) - x_o(t)$ as a modified or corrected input to the original
system. More generally, the output $x_o(t)$ may be and usually is
modified by some additional device in the feedback loop to produce
the feedback signal $x_f(t)$ before it is fed to the differential
(Fig. 16.12c).


FIGURE 16.12

Systems with feedback loops.

In Fig. 16.12c, let $G_1(s)$ and $G_2(s)$ be the transfer functions
of the original system and the feedback loop, respectively. Then,
from the definition of a transfer function as the ratio of the
transformed output to the transformed input (Sec. 7.7), we can
write

$$\mathcal{L}\{x_o(t)\} = G_1(s)\left[\mathcal{L}\{x_i(t)\} - \mathcal{L}\{x_f(t)\}\right]$$

$$\mathcal{L}\{x_f(t)\} = G_2(s)\,\mathcal{L}\{x_o(t)\}$$

If we eliminate $\mathcal{L}\{x_f(t)\}$ between these two equations, we obtain
at once

$$\mathcal{L}\{x_o(t)\} = \frac{G_1(s)}{1 + G_1(s)G_2(s)}\,\mathcal{L}\{x_i(t)\}$$

Evidently $G_1(s)/[1 + G_1(s)G_2(s)]$ is the over-all transfer function
of the entire closed-loop system.
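The elimination step is brief, but writing it out shows where the denominator $1 + G_1G_2$ comes from. In the sketch below the shorthand $X_o$, $X_i$, $X_f$ for the transforms of $x_o(t)$, $x_i(t)$, $x_f(t)$ is introduced only for this illustration and does not appear in the text.

```latex
\begin{align*}
X_f &= G_2\,X_o \\
X_o &= G_1\,(X_i - X_f) = G_1 X_i - G_1 G_2\,X_o \\
(1 + G_1 G_2)\,X_o &= G_1 X_i \\
X_o &= \frac{G_1}{1 + G_1 G_2}\,X_i
\end{align*}
```

The zeros of $1 + G_1(s)G_2(s)$ are therefore the poles that the feedback loop introduces into the over-all transfer function.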

The question of the stability of a feedback system is of great 
importance and, as we discussed above, can be answered by an 
examination, of the Laplace transform of the output, namely, 
Gi(«) 


1 + Gi(s)Gn.(s) 


£{®(01 


Now, if the original system without the feedback loop is stable for 
the input Xi(t), as we shall suppose, then the product (7i(s)«C{a;i(f) } 
can have no poles in the right half of the s-plane, and the stability 
of the over-all system depends solely on the location of the zeros 
of the denominator, 


1 + G l (s)G i (s) 

Hence, as before, we plot the locus of the function 


w(s) - 1 + Gi(s)Gt(s) 


as s ranges over the contour of Fig. 16.9. 

In this case, since $G_1(s)$ and $G_2(s)$ are themselves Laplace
transforms, each approaches zero as R becomes infinite (Corollary
1, Theorem 5, Sec. 7.1). Hence, the image of the semicircular
portion of the contour C shrinks to the single point w = 1 as
$R \to \infty$. Thus, to determine stability, it is necessary only to plot
$w(s) = 1 + G_1(s)G_2(s)$ for values of s on the imaginary axis and
determine whether or not the resulting curve encloses the origin.
Moreover, as we pointed out above, this curve can be constructed
simply by plotting $1 + G_1(i\omega)G_2(i\omega)$ for positive values of $\omega$ and
then reflecting the resulting arc in the real axis. In practice,
instead of plotting $w = 1 + G_1(i\omega)G_2(i\omega)$ and observing whether
or not the image curve encircles the origin, it is customary to
plot $w = G_1(i\omega)G_2(i\omega)$ and observe whether or not it encircles
the point w = -1. The equivalence of these two procedures is
obvious.
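
As a rough numerical illustration of the procedure just described, the
following Python sketch counts how many times the curve
$w = G_1(i\omega)G_2(i\omega)$ winds around the point w = -1. The loop function
$k/(s+1)^3$ is a hypothetical example, not one taken from the text, and the
infinite imaginary axis is approximated by a long finite segment.

```python
import cmath
import math


def loop_gain(s, k):
    """Hypothetical product G1(s)G2(s) = k/(s+1)^3, chosen only for illustration."""
    return k / (s + 1) ** 3


def encirclements_of_minus_one(k, w_max=1000.0, n=200001):
    """Unsigned number of turns of G1(iw)G2(iw) about w = -1 for -w_max <= w <= w_max."""
    total = 0.0
    prev = loop_gain(complex(0.0, -w_max), k) + 1.0   # vector from -1 to the curve
    for j in range(1, n):
        omega = -w_max + 2.0 * w_max * j / (n - 1)
        cur = loop_gain(complex(0.0, omega), k) + 1.0
        total += cmath.phase(cur / prev)              # small signed angle increment
        prev = cur
    return abs(round(total / (2.0 * math.pi)))
```

For this assumed loop function the denominator $1 + k/(s+1)^3$ vanishes
where $(s+1)^3 = -k$, which places two zeros in the right half plane
exactly when k > 8. The count returned by the sketch is accordingly 2
for k = 10 and 0 for k = 1.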

It would take us too far afield and involve us in too many 
details of a purely engineering nature to discuss the application 
of the Nyquist stability criterion to specific, nontrivial closed- 
loop systems. Such applications appear in large numbers in books 
on servomechanisms, and to these we must refer for illustrations 
and further information.* 

EXERCISES 

Using the geometric approach based on the corollary of Theorem 5, determine whether or 
not the following equations have any roots with nonnegative real parts. Check by using 
Theorem 4. 

1 s* + s + 9 <= 0 2 s 3 + 6s 8 + 10s + 6 - 0 

3 + 2s 3 + 7s 3 + 4s + 10 - 0 4 s* + s 3 + s 2 + 10s + 10 - 0 

5 Prove Theorem 3 on the assumption that the cubic has three real roots. 


* See, for instance, G. J. Thaler and R. G. Brown, "Analysis and Design 
of Feedback Control Systems," 2d ed., McGraw-Hill Book Company, New 
York, 1960, or H. Chestnut and R. W. Mayer, “Servomechanisms and Regu- 
lating System Design,” John Wiley & Sons, Inc., New York, 1951. 


CHAPTER SEVENTEEN 


Conformal Mapping 


17.1 

The geometrical representation of functions of z

Although in the last section we plotted the values of a function
w = f(z) for certain values of z, namely, those on a particular
semicircular contour, we have not as yet attempted to provide a
geometrical representation for w = f(z) when z ranges over the
entire complex plane. To do so now requires a decided departure
from the conventional methods of cartesian plotting, which
associate a curve with a real function y = g(x) and a surface with
a real function z = h(x,y). In the complex domain, a functional
relation w = f(z), that is,

    u + iv = f(x + iy)

involves four real variables, namely, the two independent variables
x and y and the two dependent variables u and v. Hence, a space
of four dimensions is required if we are to plot w = f(z) in the
cartesian fashion. To avoid the difficulties inherent in such a
device, we choose instead to proceed as follows:

Let there be given two planes, one the z-plane, in which the
point z = x + iy is to be plotted, and the other the w-plane, in
which the point u + iv is to be plotted. A function w = f(z) is
now represented not by a locus of points in a space of four
dimensions but by a correspondence between the points of these
two cartesian planes. Whenever a point is given in the z-plane,
the function w = f(z) determines one or more values of u + iv
and, hence, one or more points in the w-plane. As z ranges
over any configuration in the z-plane, the corresponding point
u + iv describes some configuration in the w-plane. The function
w = f(z) thus defines a mapping or a transformation of the
z-plane onto the w-plane and, in turn, is represented geometrically
by this mapping.




EXAMPLE 1 

Discuss the way in which the z-plane is mapped onto the w-plane by the function $w = z^2$.

In this case we have $u + iv = (x + iy)^2 = (x^2 - y^2) + 2ixy$, and thus

(1)  $u = x^2 - y^2 \qquad v = 2xy$

These are the equations of the transformation between the two planes. From them, many
features of the correspondence can easily be inferred.

For instance, lines parallel to the y-axis, i.e., lines with equations $x = c_1$, map into curves in
the w-plane whose parametric equations are, from (1),

    $u = c_1^2 - y^2 \qquad v = 2c_1 y$

Eliminating the parameter y, we obtain the equation

    $u = c_1^2 - \dfrac{v^2}{4c_1^2}$

This defines a family of parabolas having the origin of the w-plane as focus, the line v = 0 as
axis, and all opening to the left (Fig. 17.1). Similarly, lines parallel to the x-axis, i.e., lines with
equations $y = c_2$, map into curves in the w-plane whose parametric equations are

    $u = x^2 - c_2^2 \qquad v = 2c_2 x$

Eliminating x, we obtain

    $u = \dfrac{v^2}{4c_2^2} - c_2^2$

which is the equation of a family of parabolas having the origin as focus, the line v = 0 as axis,
but this time all opening to the right.

FIGURE 17.1
Plot showing the mapping of certain lines by the function $w = z^2$.

Mapping from the w-plane back onto the z-plane is even more immediate. From (1), the
lines $u = k_1$ correspond to the rectangular hyperbolas

    $x^2 - y^2 = k_1$

The lines $v = k_2$ correspond to the rectangular hyperbolas

    $xy = \tfrac{1}{2}k_2$

The images of other curves, or regions, can, with varying degrees of difficulty, be found in
the same fashion. For instance, to find the curve into which the line

    $y = 2x + 1$

is transformed, we must eliminate x and y between this equation and the equations of the
transformation. To do this, we first substitute for y in Eqs. (1), getting

    $u = x^2 - (2x + 1)^2 = -3x^2 - 4x - 1$
    $v = 2x(2x + 1) = 4x^2 + 2x$

Solving these equations for x and $x^2$, we find at once

    $x = \dfrac{4u + 3v + 4}{-10} \qquad x^2 = \dfrac{u + 2v + 1}{5}$

Hence,

    $\left(\dfrac{4u + 3v + 4}{-10}\right)^2 = \dfrac{u + 2v + 1}{5}$

or  $16u^2 + 24uv + 9v^2 + 12u - 16v = 4$

which is the equation of a parabola.

Although w is a single-valued function of z, the converse is not true. In fact, when w is given,
z may be either of the two square roots of w. Because of this, the mapping from the z-plane to
the w-plane covers the latter twice, as Fig. 17.2 shows. This, of course, is nothing but a graphic
representation of the now familiar fact that the angles of complex numbers are doubled when the
numbers are squared.

FIGURE 17.2
Plot illustrating the two-valued character of the mapping defined by $z = w^{1/2}$.
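
The eliminations carried out in this example can be spot-checked
numerically. The Python sketch below is only an illustration (the function
names are ours, not the text's): it maps points of the lines $x = c_1$ and
$y = 2x + 1$ by $w = z^2$ and evaluates how far the images lie from the
parabolas $u = c_1^2 - v^2/(4c_1^2)$ and $16u^2 + 24uv + 9v^2 + 12u - 16v = 4$.

```python
def square_map(x, y):
    """Return (u, v) for w = z**2, that is, u = x^2 - y^2 and v = 2xy."""
    w = complex(x, y) ** 2
    return w.real, w.imag


def vertical_line_residual(c1, y):
    """u - (c1^2 - v^2/(4 c1^2)) at the image of the point (c1, y); zero on the parabola."""
    u, v = square_map(c1, y)
    return u - (c1 ** 2 - v ** 2 / (4.0 * c1 ** 2))


def slanted_line_residual(x):
    """16u^2 + 24uv + 9v^2 + 12u - 16v - 4 at the image of the point (x, 2x + 1)."""
    u, v = square_map(x, 2.0 * x + 1.0)
    return 16 * u * u + 24 * u * v + 9 * v * v + 12 * u - 16 * v - 4.0
```

Both residuals vanish, up to rounding, for every sample point tried,
which is what the algebra above predicts.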

EXERCISES 

1 Discuss the mapping between the z- and w-planes defined by the function w = (2) 2 . 

2 Discuss the transformation between the z- and w-planes defined by w = x — iy. 

3 What relation, if any, exists between the transformations w =/(z) and w - f(z)? 

4 Discuss the transformation defined by w = 2iz + 1.

5 Discuss the transformation defined by $w = (x^2 - y^2) + ixy$. In what significant way does it
differ from the transformation defined by $w = z^2 = (x^2 - y^2) + 2ixy$?

6 Discuss the transformation defined by $w = z^3$. Plot the image of the line u = 1. What is
the equation of the image of the line x = 1?

7 Discuss the transformation defined by w = z*. Plot the image of the line u = 1. What is the 
equation of the image of the line a: = 1? 

8 Discuss the transformation defined by the function w = 1/z. Plot the image of the square
whose vertices are the points z = 1 + i, 2 + i, 2 + 2i, 1 + 2i.

9 Find the equations of the transformation defined by the function (z - i)/z, and show that
every circle through the origin in the z-plane is transformed into a straight line in the w-plane.

10 Discuss the transformation defined by $w = e^z$. What is the equation of the image of the line
x + y = 1?


17.2 

Conformal mapping 

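In the last section we saw that every function of a complex
variable maps the xy-plane onto the uv-plane. We now propose to
investigate in more general terms the character of this
transformation when the mapping function $w = u(x,y) + iv(x,y)$ is
analytic.

At the outset it is important to know when the transformation
equations can be solved (at least theoretically) for x and y as
single-valued functions of u and v; that is, when the
transformation has a single-valued inverse. The condition for this, as
established in most texts on advanced calculus,* is simply that the
Jacobian determinant of the transformation,

    $J\!\left(\dfrac{u,v}{x,y}\right) = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{vmatrix}$

be different from zero. Since w = f(z) is assumed to be analytic,
u and v must satisfy the Cauchy-Riemann equations. Hence,
substituting into the Jacobian, we have

    $J\!\left(\dfrac{u,v}{x,y}\right) = \begin{vmatrix} \dfrac{\partial u}{\partial x} & -\dfrac{\partial v}{\partial x} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial u}{\partial x} \end{vmatrix} = \left(\dfrac{\partial u}{\partial x}\right)^2 + \left(\dfrac{\partial v}{\partial x}\right)^2 = \left|\dfrac{\partial u}{\partial x} + i\,\dfrac{\partial v}{\partial x}\right|^2 = |f'(z)|^2$

which establishes the following result: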


THEOREM 1 

If f(z) is analytic, the transformation w = f(z) will have a single-valued inverse
in the neighborhood of any point where the derivative of the mapping function is
different from zero.


Exceptional points where f'(z) = 0 are known as critical points
of the transformation.

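Now consider a value z and its image w = f(z), where f(z) is
analytic, and let

    $\Delta z = |\Delta z|\,e^{i\theta}$  and  $\Delta w = |\Delta w|\,e^{i\phi}$

be corresponding increments of these quantities (Fig. 17.3). Then

    $f'(z) = \lim_{\Delta z \to 0} \dfrac{\Delta w}{\Delta z} = \lim_{\Delta z \to 0} \dfrac{|\Delta w|\,e^{i\phi}}{|\Delta z|\,e^{i\theta}} = \lim_{\Delta z \to 0} \left(\dfrac{|\Delta w|}{|\Delta z|}\right) e^{i(\phi - \theta)}$

From this it is apparent that

    $\lim_{\Delta z \to 0} \dfrac{|\Delta w|}{|\Delta z|} = |f'(z)|$  and  $\lim_{\Delta z \to 0} (\phi - \theta) = \arg f'(z)$

or, to an arbitrary degree of approximation,

(1)  $|\Delta w| = |f'(z)|\,|\Delta z|$

and $\phi = \theta + \arg f'(z)$, or

(2)  $\arg \Delta w = \arg \Delta z + \arg f'(z)$

FIGURE 17.3
Plot showing $\Delta z$ and its image $\Delta w$ under a mapping w = f(z).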

* See, for instance, R. C. Buck, “Advanced Calculus,’’ p. 215, McGraw-Hill 
Book Company, New York, 1956. 




Now the fact that f'(z) exists [which, of course, it does, since
f(z) is assumed to be analytic] means that both |f'(z)| and arg f'(z)
are independent of the manner in which $\Delta z \to 0$. In other words,
they depend solely on z and not on the limiting orientation of the
increment $\Delta z$. Hence, from (1) we draw the following conclusion:

THEOREM 2

In the mapping defined by an analytic function w = f(z), the lengths of
infinitesimal segments, regardless of their direction, are altered by a factor |f'(z)|
which depends only on the point from which the segments are drawn.

Since infinitesimal lengths are magnified by the factor |f'(z)|, it
follows that infinitesimal areas are magnified by the factor
$|f'(z)|^2$, that is, by J(u,v/x,y).

Similarly, we conclude from (2) that the difference between
the angles of an infinitesimal segment and its image is independent
of the direction of the segment and depends only on the point
from which the segment is drawn. In particular, two infinitesimal
segments forming an angle will both be rotated in the same
direction by the same amount; hence, the measure of the angle between
them will in general be left invariant by the transformation.
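
Relations (1) and (2) are easy to observe numerically. In the Python sketch
below, the function $f(z) = z^2$ and the point $z_0 = 1 + 2i$ are arbitrary
choices made only for illustration. Short segments leaving $z_0$ in eight
different directions are mapped, and the stretch $|\Delta w|/|\Delta z|$ and
the turn $\arg \Delta w - \arg \Delta z$ are recorded for each.

```python
import cmath
import math


def f(z):
    """Sample analytic function, chosen only for illustration: f(z) = z**2."""
    return z * z


def local_stretch_and_turn(z0, direction, step=1e-6):
    """Return (|dw|/|dz|, arg dw - arg dz) for a short segment leaving z0."""
    dz = step * cmath.exp(1j * direction)
    dw = f(z0 + dz) - f(z0)
    return abs(dw) / abs(dz), cmath.phase(dw / dz)


z0 = 1 + 2j
samples = [local_stretch_and_turn(z0, 2 * math.pi * k / 8) for k in range(8)]
```

Every entry of `samples` agrees, to within the size of the step, with
$|f'(z_0)| = |2z_0|$ and $\arg f'(z_0) = \arg 2z_0$, whatever the direction
of the segment.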

However, when f'(z) = 0, arg f'(z) is undefined, and we
cannot assert that angles are preserved. To investigate this case,
suppose that f'(z) has an n-fold zero at $z = z_0$. Then f'(z) must
contain the factor $(z - z_0)^n$, and, hence, we can write

    $f'(z) = (n + 1)a(z - z_0)^n + (n + 2)b(z - z_0)^{n+1} + \cdots$

where a, b, . . . are complex coefficients of no concern to us and
the factors n + 1, n + 2, . . . have been inserted for
convenience in integrating f'(z) to obtain f(z):

    $f(z) = f(z_0) + a(z - z_0)^{n+1} + b(z - z_0)^{n+2} + \cdots$

If in this expression we transpose $f(z_0)$, set

    $z - z_0 = \Delta z \qquad f(z) - f(z_0) = \Delta w$

and divide by $a(\Delta z)^{n+1}$, we obtain

    $\dfrac{\Delta w}{a(\Delta z)^{n+1}} = 1 + \dfrac{b}{a}\,\Delta z + \cdots$

As $\Delta z \to 0$, the right member approaches 1. Therefore,

    $\lim_{\Delta z \to 0} \left[\arg \Delta w - \arg a(\Delta z)^{n+1}\right] = \arg 1 = 0$

or, to an arbitrary degree of approximation,

    $\arg \Delta w = \arg a + (n + 1) \arg \Delta z$

Now let $\Delta z_1$ and $\Delta z_2$ be two infinitesimal segments which make an
angle $\theta$ with each other, and let $\Delta w_1$ and $\Delta w_2$ be their images.
From the last expression we have

    $\arg \Delta w_1 = \arg a + (n + 1) \arg \Delta z_1$
    $\arg \Delta w_2 = \arg a + (n + 1) \arg \Delta z_2$

Hence, subtracting,

    $\arg \Delta w_2 - \arg \Delta w_1 = (n + 1)(\arg \Delta z_2 - \arg \Delta z_1) = (n + 1)\theta$

Thus we have established the following theorem:


THEOREM 3 

In the mapping defined by an analytic function w = f(z), angles are in general
preserved in magnitude and in sense. The only exception to this occurs when the
vertex of the angle is an n-fold zero of f'(z), in which case the angle is altered by
the factor n + 1.

Example 1 of the last section is an excellent illustration of
the behavior described by Theorem 3. The mapping function
$w = f(z) = z^2$ is everywhere analytic, and, as Fig. 17.1 indicates,
angles are in general preserved. However, the derivative f'(z) = 2z
has a simple zero at z = 0, and, as Fig. 17.2 indicates, angles with
vertex at the origin are not preserved, but instead are doubled.

A transformation which preserves the magnitudes of angles
is said to be isogonal. A transformation which preserves the sense
as well as the magnitudes of angles is said to be conformal. If
f(z) is an analytic function, it follows from Theorem 3 that, in
the neighborhood of any point where $f'(z) \neq 0$, the transformation
defined by w = f(z) is conformal. Conversely, it can be shown*
that, if the mapping

    u = u(x,y)    v = v(x,y)

is conformal and if the first partial derivatives of u and v are
continuous, then w = u + iv = f(z) is an analytic function. Because
of the properties guaranteed by Theorems 2 and 3, it is clear that
under a conformal transformation any infinitesimal configuration
and its image conform, in the sense of being approximately
similar. This is not true, however, for large configurations, which may
bear little or no resemblance to their images.
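
The factor n + 1 of Theorem 3 can also be seen numerically. The Python
sketch below is an illustration only; the helper name and the sample
functions $z^2$ and $z^3$ are our own choices. It measures the angle between
the images of two short segments which leave a point $z_0$ at an angle of
30° to each other.

```python
import cmath
import math


def image_angle(f, z0, arg1, arg2, step=1e-4):
    """Angle between the images of two short segments leaving z0 in directions arg1, arg2."""
    dw1 = f(z0 + step * cmath.exp(1j * arg1)) - f(z0)
    dw2 = f(z0 + step * cmath.exp(1j * arg2)) - f(z0)
    return cmath.phase(dw2 / dw1)


def square(z):
    return z ** 2      # f'(z) = 2z has a simple zero (n = 1) at z = 0


def cube(z):
    return z ** 3      # f'(z) = 3z^2 has a double zero (n = 2) at z = 0


theta = math.pi / 6
ordinary = image_angle(square, 1 + 1j, 0.2, 0.2 + theta)   # close to theta
doubled = image_angle(square, 0j, 0.2, 0.2 + theta)        # close to 2 * theta
tripled = image_angle(cube, 0j, 0.2, 0.2 + theta)          # close to 3 * theta
```

At the ordinary point the angle is reproduced, while at the origin it is
multiplied by 2 for $z^2$ and by 3 for $z^3$, in keeping with the factor n + 1.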

One important reason for studying conformal transformations 
is that solutions of Laplace’s equation remain solutions of Laplace’s 
equation when subjected to a conformal transformation. More 
precisely, we have the following theorem: 


THEOREM 4

If $\phi(x,y)$ is a solution of the equation

    $\dfrac{\partial^2 \phi}{\partial x^2} + \dfrac{\partial^2 \phi}{\partial y^2} = 0$

then when $\phi(x,y)$ is transformed into a function of u and v by a conformal
transformation, it will satisfy the equation

    $\dfrac{\partial^2 \phi}{\partial u^2} + \dfrac{\partial^2 \phi}{\partial v^2} = 0$

everywhere except possibly at the images of the points where the derivative of the
mapping function is equal to zero.


* See, for instance, E. G. Phillips, "Functions of a Complex Variable," pp.
35, 36, Interscience Publishers, Inc., New York, 1945.


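PROOF Let w = u(x,y) + iv(x,y) define a conformal transformation by means
of which $\phi(x,y)$ is transformed into a function of u and v. Then

    $\dfrac{\partial \phi}{\partial x} = \dfrac{\partial \phi}{\partial u}\dfrac{\partial u}{\partial x} + \dfrac{\partial \phi}{\partial v}\dfrac{\partial v}{\partial x}$  and  $\dfrac{\partial \phi}{\partial y} = \dfrac{\partial \phi}{\partial u}\dfrac{\partial u}{\partial y} + \dfrac{\partial \phi}{\partial v}\dfrac{\partial v}{\partial y}$

A second differentiation of each of these yields the results

    $\dfrac{\partial^2 \phi}{\partial x^2} = \dfrac{\partial \phi}{\partial u}\dfrac{\partial^2 u}{\partial x^2} + \left(\dfrac{\partial^2 \phi}{\partial u^2}\dfrac{\partial u}{\partial x} + \dfrac{\partial^2 \phi}{\partial v\,\partial u}\dfrac{\partial v}{\partial x}\right)\dfrac{\partial u}{\partial x} + \dfrac{\partial \phi}{\partial v}\dfrac{\partial^2 v}{\partial x^2} + \left(\dfrac{\partial^2 \phi}{\partial u\,\partial v}\dfrac{\partial u}{\partial x} + \dfrac{\partial^2 \phi}{\partial v^2}\dfrac{\partial v}{\partial x}\right)\dfrac{\partial v}{\partial x}$

    $\dfrac{\partial^2 \phi}{\partial y^2} = \dfrac{\partial \phi}{\partial u}\dfrac{\partial^2 u}{\partial y^2} + \left(\dfrac{\partial^2 \phi}{\partial u^2}\dfrac{\partial u}{\partial y} + \dfrac{\partial^2 \phi}{\partial v\,\partial u}\dfrac{\partial v}{\partial y}\right)\dfrac{\partial u}{\partial y} + \dfrac{\partial \phi}{\partial v}\dfrac{\partial^2 v}{\partial y^2} + \left(\dfrac{\partial^2 \phi}{\partial u\,\partial v}\dfrac{\partial u}{\partial y} + \dfrac{\partial^2 \phi}{\partial v^2}\dfrac{\partial v}{\partial y}\right)\dfrac{\partial v}{\partial y}$

When these are added, we obtain

    $\dfrac{\partial^2 \phi}{\partial x^2} + \dfrac{\partial^2 \phi}{\partial y^2} = \dfrac{\partial \phi}{\partial u}\left(\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2}\right) + \dfrac{\partial^2 \phi}{\partial u^2}\left[\left(\dfrac{\partial u}{\partial x}\right)^2 + \left(\dfrac{\partial u}{\partial y}\right)^2\right]$
    $\qquad + 2\,\dfrac{\partial^2 \phi}{\partial u\,\partial v}\left(\dfrac{\partial u}{\partial x}\dfrac{\partial v}{\partial x} + \dfrac{\partial u}{\partial y}\dfrac{\partial v}{\partial y}\right) + \dfrac{\partial \phi}{\partial v}\left(\dfrac{\partial^2 v}{\partial x^2} + \dfrac{\partial^2 v}{\partial y^2}\right) + \dfrac{\partial^2 \phi}{\partial v^2}\left[\left(\dfrac{\partial v}{\partial x}\right)^2 + \left(\dfrac{\partial v}{\partial y}\right)^2\right]$

Since w = u + iv is analytic, by hypothesis, u and v themselves satisfy Laplace's
equation. Hence, the first and fourth groups of terms on the right vanish
identically. Moreover, u and v also satisfy the Cauchy-Riemann equations; hence the
third group of terms also vanishes identically. Using the Cauchy-Riemann
equations again, what remains can be written

    $\dfrac{\partial^2 \phi}{\partial x^2} + \dfrac{\partial^2 \phi}{\partial y^2} = \dfrac{\partial^2 \phi}{\partial u^2}\left[\left(\dfrac{\partial u}{\partial x}\right)^2 + \left(-\dfrac{\partial v}{\partial x}\right)^2\right] + \dfrac{\partial^2 \phi}{\partial v^2}\left[\left(\dfrac{\partial v}{\partial x}\right)^2 + \left(\dfrac{\partial u}{\partial x}\right)^2\right]$
    $\qquad = \left[\left(\dfrac{\partial u}{\partial x}\right)^2 + \left(\dfrac{\partial v}{\partial x}\right)^2\right]\left(\dfrac{\partial^2 \phi}{\partial u^2} + \dfrac{\partial^2 \phi}{\partial v^2}\right)$

Here the bracketed factor is just $|f'(z)|^2$, since $f'(z) = \partial u/\partial x + i\,\partial v/\partial x$.
Thus, at any point where the transformation is conformal, that is, where $f'(z) \neq 0$,

    $\dfrac{\partial^2 \phi}{\partial x^2} + \dfrac{\partial^2 \phi}{\partial y^2} = 0$  implies  $\dfrac{\partial^2 \phi}{\partial u^2} + \dfrac{\partial^2 \phi}{\partial v^2} = 0$

as asserted.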
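
The identity at the heart of this proof can be checked by finite
differences. The Python sketch below is an illustration under our own
assumptions: the harmonic function $e^u \cos v$, the sample point, and the
step size are arbitrary. It works in the direction used in applications,
carrying a solution in the uv-plane back to the xy-plane, first through the
conformal map $w = z^2$ and then, for contrast, through the nonanalytic map
$u = x^2 - y^2$, $v = xy$.

```python
import math


def harmonic_in_uv(u, v):
    """A known solution of Laplace's equation in the uv-plane: e^u cos v."""
    return math.exp(u) * math.cos(v)


def laplacian_xy(phi, x, y, h=1e-3):
    """Five-point finite-difference estimate of phi_xx + phi_yy at (x, y)."""
    return (phi(x + h, y) + phi(x - h, y) + phi(x, y + h) + phi(x, y - h)
            - 4.0 * phi(x, y)) / (h * h)


def via_conformal(x, y):
    """Carry the solution back through w = z^2: u = x^2 - y^2, v = 2xy."""
    return harmonic_in_uv(x * x - y * y, 2.0 * x * y)


def via_nonconformal(x, y):
    """Carry it back through u = x^2 - y^2, v = xy, which is not analytic."""
    return harmonic_in_uv(x * x - y * y, x * y)


conformal_residual = laplacian_xy(via_conformal, 0.7, 0.3)
nonconformal_residual = laplacian_xy(via_nonconformal, 0.7, 0.3)
```

The first residual is zero to within the discretization error, while the
second is of order 1, so that only the conformal map carries solutions of
Laplace's equation into solutions.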


Suppose now that it is required to solve Laplace's equation,
subject to certain boundary conditions, within a region R. Unless
R is of a very simple shape, a direct attack upon the problem will
usually be exceedingly difficult. However, it may be possible to
find a conformal transformation which will convert R into some
simpler region R', such as a circle or a half plane, in which
Laplace's equation can be solved, subject, of course, to the
transformed boundary conditions. If this is the case, the resulting
solution, when carried back to R by the inverse transformation,
will be the required solution of the original problem.

EXERCISES 

1 a What is the length of the curve into which the upper half of the circle \z\ = a is trans- 
formed by the function w -- 1 /£? (b) What is the length of the arc into which this function 
transforms the segment of y ~ 1 — a: which lies in the first quadrant? 

2 What is the area of the region into which the square with vertices z = 0, 1, 1 + i, i is trans- 
formed (a) by the function w « (b) by w — z 3 ? 

3 a What are the critical points of the transformation $w = 3z - z^3$? (b) What is the locus of
points at which the magnification is 1? What is the locus of points at which infinitesimal
segments are rotated (c) through 45°? (d) through 90°?

4 Are there any points at which infinitesimal segments are left unchanged in magnitude and
direction by the transformation $w = z^2 + z^3$?

5 If $u = 2x^2 + y^2$ and $v = y^2/x$, show that the curves u = constant and v = constant cut
orthogonally at all intersections, but that the transformation defined by f(z) = u + iv is
not conformal. Give a specific illustration of the latter fact.


17.3 

The bilinear transformation 


The simplest class of conformal transformations, yet one of the
most important, is the class of bilinear or linear fractional or
Möbius transformations,* defined by the family of functions

(1)  $w = \dfrac{az + b}{cz + d} \qquad ad - bc \neq 0$

The restriction $ad - bc \neq 0$ is necessary because, if ad = bc, then
a/c = b/d and the numerator and denominator of w are
proportional. As a consequence, w is a constant independent of z, and
thus the entire z-plane is mapped into the same point in the
w-plane!

It is convenient to investigate the general bilinear
transformation by considering first the three special cases

a  $w = z + \lambda$

b  $w = \mu z$

c  $w = 1/z$

In case a, w is found by adding a constant vector $\lambda$ to each z.
Hence the transformation is just a translation in the direction
defined by $\arg \lambda$ through a distance equal to $|\lambda|$. In particular, we
note for later use that this rigid motion necessarily transforms
circles into circles.

In case b, w is found by rotating each z through a fixed angle
equal to $\arg \mu$ and then multiplying its length by the factor $|\mu|$.


* Named for the German geometer A. F. Möbius (1790-1868).


In this case, too, circles are transformed into circles. To prove
this, let us first write the equation of the general circle

    $a(x^2 + y^2) + bx + cy + d = 0 \qquad a, b, c, d \text{ real} \qquad b^2 + c^2 \geq 4ad$

in terms of z and $\bar{z}$ by means of the relations

    $x = \dfrac{z + \bar{z}}{2} \qquad y = \dfrac{z - \bar{z}}{2i} \qquad x^2 + y^2 = z\bar{z}$

The result is

    $a z\bar{z} + \dfrac{b - ic}{2}\,z + \dfrac{b + ic}{2}\,\bar{z} + d = 0$

or, renaming the coefficients,

(2)  $(A + \bar{A})z\bar{z} + Bz + \bar{B}\bar{z} + (D + \bar{D}) = 0$

where now A, B, and D can be arbitrary complex numbers,
subject to the condition $B\bar{B} \geq (A + \bar{A})(D + \bar{D})$, derived from the
condition $b^2 + c^2 \geq 4ad$, which ensures that the radius of the
circle is real. If the substitution

    $z = \dfrac{w}{\mu}$

is made in (2), we obtain the transformed equation

    $(A + \bar{A})\dfrac{w\bar{w}}{\mu\bar{\mu}} + B\,\dfrac{w}{\mu} + \bar{B}\,\dfrac{\bar{w}}{\bar{\mu}} + (D + \bar{D}) = 0$

or

(3)  $(A + \bar{A})w\bar{w} + (B\bar{\mu})w + (\bar{B}\mu)\bar{w} + (D + \bar{D})\mu\bar{\mu} = 0$

Since the coefficients of the first and last terms in (3) are real and
since the coefficients of w and $\bar{w}$ are conjugates, this equation has
the same structure as (2) and, hence, will also represent a circle
provided its coefficients satisfy the condition necessary for the
radius to be real. For (3), this condition is

    $(B\bar{\mu})(\bar{B}\mu) \geq (A + \bar{A})(D + \bar{D})\mu\bar{\mu}$

or, dividing through by $\mu\bar{\mu}$, which is necessarily positive,

    $B\bar{B} \geq (A + \bar{A})(D + \bar{D})$

which is true by hypothesis. If a = 0, so that $A + \bar{A} = 0$, both
the given circle and its image reduce to straight lines.

In case c we can write

(4)  $w = \dfrac{1}{z} = \dfrac{\bar{z}}{z\bar{z}} = \dfrac{\bar{z}}{|z|^2}$

which shows that w is of length 1/|z| and has the direction of $\bar{z}$.

To describe the geometrical process by which a point with
these characteristics can be obtained from a given point z, we
must first define the process of inversion. Let C be a circle with
center O and radius r, and let P be any point in the plane of C.
Then the inverse of P with respect to C is the point P' on the
ray OP for which

(5)  $OP \cdot OP' = r^2$

From the symmetry of this relation it is clear that P is also the
inverse of P'. Geometrically, a point and its inverse are related as
follows: From any point P outside a circle C with center O, let
the two tangents to C be drawn, and let the points of contact of
these tangents be joined (Fig. 17.4). The intersection of this chord
with the line OP is the inverse P' of P. Conversely, let P' be any
point in the interior of C. At P' erect a perpendicular to OP', and
at the point where this meets C let the tangent to C be drawn.
The intersection of this tangent and the line OP' is the inverse
P of P'. The consistency of these constructions with the definitive
property (5) is evident, since in Fig. 17.4

    $\triangle OP'T_1 \sim \triangle OT_1P$

and thus

    $\dfrac{OP'}{OT_1} = \dfrac{OT_1}{OP}$  or  $OP \cdot OP' = (OT_1)^2 = r^2$

FIGURE 17.4
Plot showing the geometrical relation between a point and its inverse.

It is evident now that the construction of w from z in case c
requires that the inverse of z in the unit circle be found and then
reflected in the real axis; for the first of these steps gives a
complex number whose length is 1/|z|, and the second achieves the
direction of $\bar{z}$, as required by (4).

To show that circles are also transformed into circles in case
c, let the substitution z = 1/w be made in the self-conjugate form
of the equation of a circle (2). This gives

    $(A + \bar{A})\dfrac{1}{w\bar{w}} + \dfrac{B}{w} + \dfrac{\bar{B}}{\bar{w}} + (D + \bar{D}) = 0$

or  $(D + \bar{D})w\bar{w} + \bar{B}w + B\bar{w} + (A + \bar{A}) = 0$

which is also the equation of a circle with real radius. If $A + \bar{A} = 0$,
the original circle reduces to a straight line whose image is a circle
passing through the origin, since its equation contains no constant
term. Conversely, any circle passing through the origin is
transformed into a straight line.

The three special transformations we have just considered
can be used to synthesize the general bilinear transformation. To
see this, suppose first that $c \neq 0$. Then the general transformation
is equivalent to the following chain of special transformations:

    $w_1 = z + \dfrac{d}{c}$

    $w_2 = cw_1 = cz + d$

    $w_3 = \dfrac{1}{w_2} = \dfrac{1}{cz + d}$

    $w_4 = \dfrac{bc - ad}{c}\,w_3 = \dfrac{bc - ad}{c(cz + d)}$

    $w = \dfrac{a}{c} + w_4 = \dfrac{a}{c} + \dfrac{bc - ad}{c(cz + d)} = \dfrac{az + b}{cz + d}$

On the other hand, if c = 0, it is clear from the restriction
$ad - bc \neq 0$ that neither a nor d can be zero. Hence, we can write

    $w_1 = z + \dfrac{b}{a} \qquad w = \dfrac{a}{d}\,w_1 = \dfrac{az + b}{d}$

Thus we have shown that in all cases the general bilinear
transformation can be compounded from a succession of simple
transformations of types a, b, and c. Since each of these is known to
transform circles into circles, including straight lines as special
cases, we have thus established the following theorem:


THEOREM 1

Under the general bilinear transformation circles are transformed into circles.
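
Both the chain of special transformations and Theorem 1 lend themselves to
a quick numerical check. In the Python sketch below the coefficients a, b,
c, d and the sample circle are hypothetical values chosen only for
illustration, and the helper names are ours. The sketch compares the direct
formula with the five-step chain for $c \neq 0$, and then tests whether the
images of twelve points of a circle are all equidistant from the center of
the circle through three of them.

```python
import cmath
import math


def bilinear(z, a, b, c, d):
    """w = (az + b)/(cz + d), assuming ad - bc != 0."""
    return (a * z + b) / (c * z + d)


def bilinear_by_steps(z, a, b, c, d):
    """The same map for c != 0, built from the special cases a, b, and c."""
    w1 = z + d / c                    # case a: translation
    w2 = c * w1                       # case b: rotation and stretching
    w3 = 1 / w2                       # case c: reciprocal
    w4 = (b * c - a * d) / c * w3     # case b again
    return a / c + w4                 # case a again


def circumcenter(p, q, r):
    """Center of the circle through three noncollinear complex points."""
    q0, r0 = q - p, r - p
    num = abs(q0) ** 2 * r0 - abs(r0) ** 2 * q0
    den = q0.conjugate() * r0 - q0 * r0.conjugate()
    return p + num / den


a, b, c, d = 2 + 1j, -1, 1j, 3        # hypothetical coefficients, ad - bc = 6 + 4i
circle = [(1 + 1j) + 2.0 * cmath.exp(2j * math.pi * k / 12) for k in range(12)]
images = [bilinear(z, a, b, c, d) for z in circle]
center = circumcenter(images[0], images[4], images[8])
radii = [abs(w - center) for w in images]
```

The entries of `radii` agree to within rounding error, so the twelve image
points lie on a single circle. The chosen circle avoids the point z = -d/c,
whose image is at infinity; a circle through that point would be sent into
a straight line instead.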


The general bilinear transformation

    $w = \dfrac{az + b}{cz + d}$

depends on three essential constants, namely, the ratios of any
three of the constants a, b, c, d to the fourth. Hence it is evident
that three conditions are necessary to determine a bilinear
transformation. In particular, the requirement that three distinct
values of z, say $z_1, z_2, z_3$, have specified distinct images $w_1, w_2, w_3$
leads to a unique transformation.

Although the transformation which sends three given points
into three specified image points can be found by imposing these
conditions on the general equation and solving for the constants,
it is generally simpler to make use of the fact that if $w_1, w_2, w_3, w_4$
are, respectively, the images of $z_1, z_2, z_3, z_4$, then

    $\dfrac{(w_1 - w_2)(w_3 - w_4)}{(w_1 - w_4)(w_3 - w_2)} = \dfrac{(z_1 - z_2)(z_3 - z_4)}{(z_1 - z_4)(z_3 - z_2)}$

To establish this relation, we observe that

    $w_i - w_j = \dfrac{az_i + b}{cz_i + d} - \dfrac{az_j + b}{cz_j + d} = \dfrac{(ad - bc)(z_i - z_j)}{(cz_i + d)(cz_j + d)}$

Hence

    $\dfrac{(w_1 - w_2)(w_3 - w_4)}{(w_1 - w_4)(w_3 - w_2)} = \dfrac{\dfrac{(ad - bc)(z_1 - z_2)}{(cz_1 + d)(cz_2 + d)} \cdot \dfrac{(ad - bc)(z_3 - z_4)}{(cz_3 + d)(cz_4 + d)}}{\dfrac{(ad - bc)(z_1 - z_4)}{(cz_1 + d)(cz_4 + d)} \cdot \dfrac{(ad - bc)(z_3 - z_2)}{(cz_3 + d)(cz_2 + d)}} = \dfrac{(z_1 - z_2)(z_3 - z_4)}{(z_1 - z_4)(z_3 - z_2)}$

The last fraction is called the cross ratio or anharmonic ratio of
the four numbers $z_1, z_2, z_3, z_4$; hence the result we have just
established can be formulated as the following theorem:


THEOREM 2

The cross ratio of four points is invariant under a bilinear transformation.

Suppose now that it is required to find the transformation
which sends $z_1, z_2, z_3$ into $w_1, w_2, w_3$, respectively. If w is the image
of a general point z under this transformation, then, according to
Theorem 2, the cross ratio of $w_1, w_2, w_3$, and w must equal the
cross ratio of $z_1, z_2, z_3$, and z. That is,

    $\dfrac{(w_1 - w_2)(w_3 - w)}{(w_1 - w)(w_3 - w_2)} = \dfrac{(z_1 - z_2)(z_3 - z)}{(z_1 - z)(z_3 - z_2)}$

This equation is clearly bilinear in w and z and is satisfied by the
three pairs of values $(z_1,w_1)$, $(z_2,w_2)$, $(z_3,w_3)$. Moreover, everything
in it is known except the variables w and z themselves; hence it is
necessary only to solve for w in terms of z to obtain the required
transformation in standard form.

EXAMPLE 1 

What is the bilinear transformation which sends the points z = -1, 0, 1 into the points w = 0,
i, 3i, respectively?

Setting up the appropriate cross ratios, we have

    $\dfrac{(0 - i)(3i - w)}{(0 - w)(3i - i)} = \dfrac{(-1 - 0)(1 - z)}{(-1 - z)(1 - 0)}$

or  $\dfrac{3i - w}{2w} = \dfrac{1 - z}{1 + z}$

Solving for w, we obtain without difficulty

    $w = \dfrac{3i(z + 1)}{3 - z}$
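
The cross-ratio method is mechanical enough to be sketched in a few lines
of Python. The helper name `bilinear_through` is ours and the code is only
an illustration: it clears fractions in the equality of cross ratios and
solves the resulting linear equation for w.

```python
def bilinear_through(z_pts, w_pts):
    """Return the map sending z1, z2, z3 into w1, w2, w3, from equality of cross ratios."""
    z1, z2, z3 = z_pts
    w1, w2, w3 = w_pts

    def transform(z):
        n = (z1 - z2) * (z3 - z)      # numerator of the cross ratio in z
        m = (z1 - z) * (z3 - z2)      # denominator of the cross ratio in z
        # (w1 - w2)(w3 - w) m = n (w1 - w)(w3 - w2), solved for w:
        return ((n * (w3 - w2) * w1 - m * (w1 - w2) * w3)
                / (n * (w3 - w2) - m * (w1 - w2)))

    return transform


example_1 = bilinear_through((-1, 0, 1), (0, 1j, 3j))
```

With the data of this example, `example_1` sends -1, 0, 1 into 0, i, 3i and
agrees elsewhere with $w = 3i(z + 1)/(3 - z)$. At the pole z = 3 the
denominator vanishes and the sketch raises a ZeroDivisionError.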


EXAMPLE 2 

What is the most general bilinear transformation which maps the upper half of the z-plane onto
the interior of the unit circle in the w-plane (Fig. 17.5)?

Let the required transformation be

    $w = \dfrac{az + b}{cz + d}$

FIGURE 17.5
The upper half of the z-plane to be mapped onto the interior of the unit circle in the w-plane.

Since the boundaries of corresponding regions must correspond under any transformation, the
unit circle in the w-plane must be the image of the real axis in the z-plane. Therefore, for all real
values of z, we must have

    $|w| = \left|\dfrac{az + b}{cz + d}\right| = \dfrac{|a|}{|c|}\,\dfrac{|z + (b/a)|}{|z + (d/c)|} = 1$

In particular, from the limiting case $|z| \to \infty$, we find

    $\dfrac{|a|}{|c|} = 1$

and thus for real values of z

    $\left|z + \dfrac{b}{a}\right| = \left|z + \dfrac{d}{c}\right|$

The last equation expresses the fact that the complex numbers -b/a and -d/c are equally far
from all points on the real axis, which is possible if and only if the real axis is the perpendicular
bisector of the segment joining the points -b/a and -d/c. Therefore, -b/a and -d/c must be
conjugates, say $\lambda$ and $\bar{\lambda}$. Thus we can write

(6)  $w = \dfrac{a}{c}\,\dfrac{z - \lambda}{z - \bar{\lambda}} = e^{i\theta}\,\dfrac{z - \lambda}{z - \bar{\lambda}}$

where the last step follows because, as we found earlier, a/c is a complex number of absolute
value 1.

So far we have enforced only the condition that the boundaries of the two regions correspond.
It is now necessary to make sure that the regions themselves correspond as required and that the
upper half of the z-plane has not been mapped into the outside of the circle |w| = 1. This is most
easily verified by checking some convenient point, say $z = \lambda$. This maps into w = 0, which is
certainly inside the circle |w| = 1. Thus, if $\lambda$ is restricted to lie in the upper half of the z-plane,
the solution is complete.

As a special case of some interest, let $e^{i\theta} = -1$, and let $\lambda$ be a pure imaginary, say i. Then

(7)  $w = \dfrac{i - z}{i + z}$

Now

    $\mathcal{I}(w) = \dfrac{w - \bar{w}}{2i} = \dfrac{1}{2i}\left(\dfrac{i - z}{z + i} + \dfrac{i + \bar{z}}{\bar{z} - i}\right)$

or, reducing to a common denominator and simplifying,

    $\mathcal{I}(w) = \dfrac{z + \bar{z}}{(z + i)(\bar{z} - i)}$

The denominator of the last fraction is the product of z + i and its conjugate $\bar{z} - i$ and, hence, is
a positive quantity. Thus, the imaginary part of w will be positive if and only if $z + \bar{z}$ is positive.
Since $z + \bar{z}$ is equal to twice the real part of z, this shows that the transformation (7) not only
maps the upper half of the z-plane onto the unit circle $|w| \leq 1$ but does it in such a way that the
first quadrant of the z-plane [where $\mathcal{R}(z) > 0$] corresponds to the upper half of the circle [where
$\mathcal{I}(w) > 0$] and the second quadrant of the z-plane corresponds to the lower half of the circle.
In the opposite direction, the inverse transformation

(8)  $z = i\,\dfrac{1 - w}{1 + w}$

maps the interior of the circle |w| = 1 onto the upper half of the z-plane in such a way that the
upper half of the circle maps onto the first quadrant of the z-plane.
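
The properties claimed for this special case can be confirmed numerically.
The Python sketch below is an illustration only, with function names of our
own choosing. It takes $e^{i\theta} = -1$ and $\lambda = i$, for which the
transformation reads $w = (i - z)/(i + z)$ and its inverse is
$z = i(1 - w)/(1 + w)$.

```python
def to_disk(z):
    """Special case of the family: w = (i - z)/(i + z)."""
    return (1j - z) / (1j + z)


def to_half_plane(w):
    """Inverse of to_disk: z = i(1 - w)/(1 + w)."""
    return 1j * (1 - w) / (1 + w)


on_real_axis = [abs(to_disk(x)) for x in (-5.0, -1.0, 0.0, 0.5, 40.0)]
first_quadrant_image = to_disk(2 + 3j)      # expected in the upper half of the disk
second_quadrant_image = to_disk(-2 + 3j)    # expected in the lower half of the disk
```

Every entry of `on_real_axis` equals 1 to within rounding error, the point
2 + 3i goes into -0.6 + 0.2i, and its mirror image -2 + 3i goes into the
lower half of the disk, as the sign of $z + \bar{z}$ requires.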


EXAMPLE 3 

Find a transformation which will map an infinite sector of angle $\pi/4$ onto the interior of the unit
circle.

Since the boundary of the sector consists of portions of two straight lines, yet its image is to
be a single circle, it is apparent that the mapping cannot be accomplished by a bilinear
transformation alone. However, a simple combination of a power function and a linear fractional
function will define a suitable transformation. Specifically, the transformation

    $t = z^4$

will open out the sector in the z-plane into the upper half of the auxiliary t-plane (Fig. 17.6).
Following this, the upper half of the t-plane can be mapped onto the unit circle in the w-plane by
any transformation of the family (6), which we obtained in the last example, say

    $w = \dfrac{t - i}{t + i}$

Combining these two, we have for the required transformation

    $w = \dfrac{z^4 - i}{z^4 + i}$

FIGURE 17.6
The two transformations needed to map an infinite sector onto the interior of the unit circle.

EXAMPLE 4 

Find a transformation which will map a 60° sector of the unit circle in the z-plane onto the upper
half of the w-plane.

At first glance it would seem that this problem can be solved simply by opening the given
sector into a full circle by the transformation

    $t = z^6$

and then mapping the circle from the t-plane onto the upper half of the w-plane by means of the
inverse of one of the transformations of the family (6) which we obtained in Example 2, for
instance, the transformation (8). This method fails, however, because the circular region obtained
in the t-plane in this case is not of the type considered in Example 2. The latter consisted of a
simple circular boundary plus its interior, whereas the former consists of the interior of a circle "cut"
along a radius, since the radius O'A' = O'B' is actually the image of the two boundary radii OA
and OB (Fig. 17.7).

FIGURE 17.7
A circular sector "opened out" into a circular region cut along a radius.

To avoid this difficulty, let us first map the sector onto a semicircle by the transformation

    $t_1 = z^3$

Then let us map the semicircle from the $t_1$-plane onto the first quadrant of the $t_2$-plane by means
of the transformation (8)

    $t_2 = i\,\dfrac{1 - t_1}{1 + t_1}$

Finally (Fig. 17.8) let us open out the first quadrant of the $t_2$-plane into the upper half of the
w-plane by the transformation

    $w = t_2^2$

Combining these three transformations, we find

    $w = \left(i\,\dfrac{1 - z^3}{1 + z^3}\right)^2 = -\left(\dfrac{1 - z^3}{1 + z^3}\right)^2$

as the required solution.

FIGURE 17.8
The sequence of transformations necessary to map a circular sector onto a half plane
(panels: z plane, $t_1$ plane, $t_2$ plane, w plane).
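
Composite mappings of this kind are conveniently tested by sampling. The
Python sketch below is an illustration only; it assumes, as the figures
suggest but the text does not state, that each sector has its vertex at the
origin and one edge along the positive real axis. It applies the composite
of Example 3, $t = z^4$ followed by $w = (t - i)/(t + i)$, and the composite
of Example 4, $t_1 = z^3$, $t_2 = i(1 - t_1)/(1 + t_1)$, $w = t_2^2$, to
interior points of the respective sectors.

```python
import cmath
import math


def sector_to_disk(z):
    """Example 3 composite: t = z^4, then w = (t - i)/(t + i)."""
    t = z ** 4
    return (t - 1j) / (t + 1j)


def circular_sector_to_half_plane(z):
    """Example 4 composite: t1 = z^3, t2 = i(1 - t1)/(1 + t1), w = t2^2."""
    t1 = z ** 3
    t2 = 1j * (1 - t1) / (1 + t1)
    return t2 ** 2


def polar(r, angle):
    """Complex number with modulus r and argument angle."""
    return r * cmath.exp(1j * angle)


# Interior samples: 0 < arg z < pi/4 for Example 3; |z| < 1, 0 < arg z < pi/3 for Example 4.
disk_images = [sector_to_disk(polar(r, t)) for r in (0.3, 1.0, 5.0) for t in (0.1, 0.4, 0.7)]
half_plane_images = [circular_sector_to_half_plane(polar(r, t))
                     for r in (0.2, 0.6, 0.95) for t in (0.1, 0.5, 1.0)]
```

Each entry of `disk_images` has absolute value less than 1 and each entry of
`half_plane_images` has a positive imaginary part, which is what the two
constructions are meant to achieve.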

EXAMPLE 5 

A thin sheet of metal coincides with the first quadrant of the z-plane. The upper and lower faces 
of the sheet are perfectly insulated against the flow of heat. Find the steady-state temperature 
at any point of the sheet if the boundary temperatures are those shown in Fig. 17.9a. 


FIGURE 17.9

An infinite 90° sector mapped, with its boundary conditions, onto a half plane:
(a) the z-plane, (b) the w-plane.


Under the assumptions of the problem, the flow of heat in the sheet is two-dimensional, and 
we must accordingly solve Laplace’s equation, i.e., the two-dimensional steady-state heat equa- 
tion derived in Sec. 8.2, 

$$\frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} = 0$$

subject to the given conditions along the boundaries of the first quadrant. To do this, it is con- 
venient to map the first quadrant of the z-plane onto the upper half of the w-plane by the 
transformation 

$$w = z^2 = (x^2 - y^2) + 2ixy$$

This reduces the problem to that of finding a solution of Laplace’s equation in the upper half
plane which assumes along the real axis the boundary conditions shown in Fig. 17.9b.

Now we have long since discovered (Property 1, Sec. 14.6) that either the real or the 
imaginary part of any analytic function satisfies Laplace’s equation. In particular, since the 
function 

(9) $$iT_0 + \frac{1}{\pi}\left[(T_1 - T_0)\ln(z - x_0) + (T_2 - T_1)\ln(z - x_1) + \cdots + (T_{n+1} - T_n)\ln(z - x_n)\right]$$

is analytic except at the real points $x_0, x_1, \ldots, x_n$, its imaginary part, namely,

(10) $$T = T_0 + \frac{1}{\pi}\left[(T_1 - T_0)\arg(z - x_0) + (T_2 - T_1)\arg(z - x_1) + \cdots + (T_{n+1} - T_n)\arg(z - x_n)\right]$$

will be a solution of Laplace’s equation. Moreover, along the real axis, this solution takes on the
boundary values shown in Fig. 17.10. To see this, we observe from Fig. 17.10 that the complex
number $z - x_i$ is represented by the vector joining the fixed point $x_i$ to the variable point z,
and thus arg $(z - x_i)$ is simply the inclination angle $\theta_i$ of this vector. Hence the
function (10) can be rewritten

(11) $$T = T_0 + \frac{1}{\pi}\left[(T_1 - T_0)\theta_0 + (T_2 - T_1)\theta_1 + \cdots + (T_{n+1} - T_n)\theta_n\right]$$

FIGURE 17.10

Plot showing the behavior of arg $(z - x_i)$ as z varies along the real axis.


Again referring to Fig. 17.10, it is clear that, for all values of z on the real axis to the right
of $x_0$, each of the θ’s is zero. Hence from (11) we see that T reduces to the constant value $T_0$
along this portion of the real axis. Furthermore, when z lies between $x_1$ and $x_0$, $\theta_0$ is
equal to π, but all the other θ’s are still zero. Hence, along this segment the temperature (10),
or (11), reduces to

$$T = T_0 + \frac{1}{\pi}\left[(T_1 - T_0)\pi\right] = T_1$$

Similarly, for values of z between $x_2$ and $x_1$, the angles $\theta_0$ and $\theta_1$ are each
equal to π, but all other θ’s are zero. Hence, along this segment, we have

$$T = T_0 + \frac{1}{\pi}\left[(T_1 - T_0)\pi + (T_2 - T_1)\pi\right] = T_2$$

Continuing in this fashion, we can verify that T, as defined by (10) or (11), not only is a solution 
of Laplace’s equation, being the imaginary part of the analytic function (9), but also assumes 
along the real axis the temperature distribution shown in Fig. 17.10. 

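Specializing these observations to our problem, it appears that the solution we require is

$$T = 100 + \frac{1}{\pi}\left[(0 - 100)\theta_0 + (100 - 0)\theta_1\right] = 100 + \frac{100}{\pi}(\theta_1 - \theta_0) = \frac{100}{\pi}\left[\pi + (\theta_1 - \theta_0)\right]$$

Now, multiplying by π/100 and then taking the tangent of both sides of the last equation,
we have

$$\tan\frac{\pi T}{100} = \tan\left[\pi + (\theta_1 - \theta_0)\right] = \tan(\theta_1 - \theta_0) = \frac{\tan\theta_1 - \tan\theta_0}{1 + \tan\theta_0\tan\theta_1}$$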

Substituting for $\tan\theta_0$ and $\tan\theta_1$ their values as read from Fig. 17.9b, we obtain
from the last expression

(12) $$\tan\frac{\pi T}{100} = \frac{v/(u + 1) - v/(u - 4)}{1 + v^2/[(u + 1)(u - 4)]} = \frac{-5v}{u^2 + v^2 - 3u - 4}$$

which is the solution of the transformed problem in the w-plane. Returning to the z-plane by
means of the transformation equations

$$u = x^2 - y^2 \quad\text{and}\quad v = 2xy$$

we thus find, from (12), that

$$\tan\frac{\pi T}{100} = \frac{-10xy}{(x^2 + y^2)^2 - 3x^2 + 3y^2 - 4}$$

is the solution to the original problem. 
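
As a sanity check on this result, the temperature can be evaluated directly from the argument form of the solution. The sketch below assumes the break points u = 4 and u = -1 in the w-plane, as implied by the tangent expressions above; the function name `temperature` and the probe points are illustrative only.

```python
import cmath
import math


def temperature(x, y):
    """T = 100 + (100/pi)(theta_1 - theta_0), evaluated at the image w = z**2."""
    w = complex(x, y) ** 2
    theta0 = cmath.phase(w - 4)   # inclination of the vector from u = 4 to w
    theta1 = cmath.phase(w + 1)   # inclination of the vector from u = -1 to w
    return 100 + (100 / math.pi) * (theta1 - theta0)


eps = 1e-9
on_x_axis_right = temperature(3.0, eps)   # x-axis to the right of x = 2
on_x_axis_left = temperature(1.0, eps)    # x-axis between 0 and 2
on_y_axis_low = temperature(eps, 0.5)     # y-axis between 0 and 1
on_y_axis_high = temperature(eps, 2.0)    # y-axis above y = 1

# Interior check against the closed form tan(pi T / 100) at (x, y) = (2, 1).
x, y = 2.0, 1.0
closed_form = -10 * x * y / ((x**2 + y**2) ** 2 - 3 * x**2 + 3 * y**2 - 4)
from_angles = math.tan(math.pi * temperature(x, y) / 100)
print(on_x_axis_right, on_x_axis_left, on_y_axis_low, on_y_axis_high)
print(closed_form, from_angles)
```

Near the boundary the computed values approach 100 and 0 on the corresponding segments, and at the interior point the two expressions for the tangent agree.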




EXERCISES 

1 What is the cross ratio of the four fourth roots of -1?

2 What is the cross ratio of the four complex sixth roots of 3?

3 Show that in general there are two points which are left invariant by a bilinear transformation.
Are there any bilinear transformations which leave only one point invariant? no points
invariant?

4 Find the invariant points of the transformation w = -(2z + 4i)/(iz + 1), and prove that
these two points, together with any point z and its image w, form a set of four points having
a constant cross ratio.

5 What is the bilinear transformation which sends the points z = 0, -1, ∞ into the points
w = -1, -2 - i, i, respectively? What is the image of the circle |z| = 1 under this
transformation?

6 What is the bilinear transformation which sends the points z = 0, -i, 2i into the points
w = 5i, ∞, -i/3, respectively? What are the invariant points of this transformation?

7 What is the most general bilinear transformation which maps the upper half of the z-plane
onto the lower half of the w-plane?

8 Prove that w = z/(1 - z) maps the upper half of the z-plane into the upper half of the
w-plane. What is the image of the circle |z| = 3 under this transformation?

9 Find a transformation which will map an infinite sector of angle π/3 onto the interior of the
unit circle.

10 Show that, along the circle |cz + d| = √|ad - bc|, the transformation w = (az + b)/(cz + d)
does not alter the lengths of infinitesimal segments. What happens to segments inside this
circle? outside this circle? What is the locus of points where infinitesimal segments are not
rotated by the transformation?

11 Find a transformation which will map a 45° sector of the unit circle in the z-plane onto the
upper half of the w-plane.

12 Find a transformation which will map the upper half of the unit circle onto the entire unit
circle.

13 Show that, if |c| = |d|, then the transformation w = (az + b)/(cz + d) maps the unit circle
in the z-plane into a straight line in the w-plane.

14 Verify that the transformation $w = re^{ia}(z - z_1)/(z - z_2)$ maps the region in the z-plane
bounded by two circular arcs intersecting at an angle α at $z_1$ and $z_2$ into the interior of an
angle α in standard position in the w-plane. [Hint: Recall that an equation of the form (2)
represents a circle in the z-plane.]

15 Prove that four points $z_1, z_2, z_3, z_4$ lie on a circle if and only if their cross ratio is real.

16 Find the steady-state temperature distribution in a sheet of metal coinciding with the first
quadrant of the z-plane if T = 100° along the positive x-axis and T = 0° along the positive
y-axis.

17 Find the steady-state temperature distribution in a sheet of metal coinciding with the interior
of a 60° angle in standard position in the z-plane if T = 0° along the horizontal side of
the angle and T = 100° along the other side.

18 Find the steady-state temperature distribution in a sheet of metal coinciding with the first
quadrant of the z-plane if T = 100° along the positive y-axis, if T = 50° between 0 and 3 on
the x-axis, and if T = 0° to the right of 3 on the x-axis.

19 Find the steady-state temperature distribution in the unit circle in the z-plane if the upper
half of the boundary of the circle is kept at the temperature T = 100° and the lower half of
the boundary is kept at the temperature T = 0°.

20 Show that w = z + 1/z maps the portion of the upper half of the z-plane exterior to the
circle |z| = 1 onto the entire upper half of the w-plane. Use this result to find the steady-state
temperature distribution in the upper half of the z-plane exterior to the unit circle if
T = 100° along the linear portion of the boundary and T = 0° along the circular portion of
the boundary.



17.4

The Schwarz-Christoffel transformation

In general, the conformal transformation of one given region into
another is exceedingly difficult. The existence of such a transformation
is assured by the following theorem, due to Riemann:

THEOREM 1

Any two bounded simply connected regions can be mapped conformally onto each other.

However, the determination of the specific function which ac- 
complishes a required mapping is usually out of the question. In 
fact, in addition to the simple regions which we found could be 
mapped by means of the elementary functions, the only class of 
regions for which conformal transformations of practical interest 
exist are those bounded by polygons having a finite number of 
vertices (one or more of which may lie at infinity). These can 
always be mapped onto a half plane and, hence, onto any region 
into which a half plane can be transformed, by means of a trans- 
formation which we shall now discuss. 

To see how this can be done, we first recall the mapping
properties of the power function

(1) $$w = z^m$$

Since this transformation has the property (Theorem 3, Sec. 17.2)
that it alters by the factor m any angle with vertex at the origin,
it follows that the transformation

$$w - w_1 = (z - x_1)^{\alpha_1/\pi}$$

will take a segment of the x-axis containing $x_1$ in its interior, i.e.,
a straight angle with vertex at $x_1$, and “fold” it into an angle of

$$\pi \cdot \frac{\alpha_1}{\pi} = \alpha_1$$

with vertex at $w_1$. Clearly, if this could be done simultaneously
for a number of points $x_1, x_2, \ldots, x_n$ on the x-axis, the x-axis
would be mapped into a polygon whose angles were, respectively,
$\alpha_1, \alpha_2, \ldots, \alpha_n$ and conversely, and the biggest step in the
solution of our problem would be taken. This is actually possible, and
the transformation which accomplishes it, suggested by the form
of the derivative of (1), is defined by

(2) $$\frac{dw}{dz} = K(z - x_1)^{(\alpha_1/\pi) - 1}(z - x_2)^{(\alpha_2/\pi) - 1} \cdots (z - x_n)^{(\alpha_n/\pi) - 1}$$


FIGURE 17.11

The mapping of the real axis in the z-plane into a polygon with prescribed angles in the w-plane.
(Panels (a) and (b).)

To verify this, we begin with a point z on the x-axis to the left
of the first of the given points $x_1, x_2, \ldots, x_n$ and investigate the
locus of its image as it moves to the right along the x-axis (Fig. 17.11).
From (2) we obtain at once the relation

(3) $$\arg dw = \arg K + \left(\frac{\alpha_1}{\pi} - 1\right)\arg(z - x_1) + \left(\frac{\alpha_2}{\pi} - 1\right)\arg(z - x_2) + \cdots + \left(\frac{\alpha_n}{\pi} - 1\right)\arg(z - x_n) + \arg dz$$

and in this it is apparent that until z reaches $x_1$, every term on the
right remains constant, since $z - x_1, z - x_2, \ldots, z - x_n$ are all
negative real numbers and, hence, have π for their respective
arguments, and since dz is positive and, therefore, has 0 as its
argument. Thus the image point w traces a straight line, since the
argument of the increment dw remains constant. However, as z
passes through $x_1$, the difference $z - x_1$ changes abruptly from
negative to positive, and thus arg $(z - x_1)$ decreases abruptly
from π to 0. Hence, arg dw changes by the amount

$$\left(\frac{\alpha_1}{\pi} - 1\right)(-\pi) = \pi - \alpha_1$$

But, from Fig. 17.11b, it is evident that this is the precise amount
through which it is necessary to turn if w is to begin to move in the
direction of the next side of the polygon. As z moves from $x_1$ to $x_2$,
the same situation exists. The argument of dw remains constant,
and thus w moves in a straight line until z reaches $x_2$. Here $z - x_2$
changes abruptly from negative to positive, arg $(z - x_2)$ jumps
from π to 0, and, as a consequence, arg dw increases by the amount
$\pi - \alpha_2$, which is the exact amount of rotation required to give the
direction of the next side of the polygon.

Thus as z traverses the x-axis, it is clear that w moves along
the boundary of a polygon whose interior angles are precisely the
given angles $\alpha_1, \alpha_2, \ldots, \alpha_n$. Moreover, it is evident that the
region which is mapped onto the half plane is the region which
contains these angles. The required transformation will be obtained
if we can ensure that the lengths of the sides of the polygon,
as well as its angles, have the correct values.
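
The turning argument can be illustrated by evaluating the right member of relation (3) at real points on either side of each $x_k$. The sketch below is a minimal check, assuming K = 1 (so arg K = 0), dz > 0, and two hypothetical vertices with right angles; the name `arg_dw` is introduced here only for illustration.

```python
import math


def arg_dw(x, vertices, angles):
    """Right member of relation (3) at a real point x, with arg K = 0 and arg dz = 0."""
    total = 0.0
    for x_k, alpha_k in zip(vertices, angles):
        # arg(x - x_k) is pi to the left of x_k and 0 to the right of it
        arg = math.atan2(0.0, x - x_k)
        total += (alpha_k / math.pi - 1.0) * arg
    return total


vertices = [-1.0, 1.0]                   # assumed image points x_1, x_2
angles = [math.pi / 2, math.pi / 2]      # assumed interior angles alpha_1, alpha_2
probes = [-2.0, 0.0, 2.0]                # one point on each segment of the x-axis

directions = [arg_dw(x, vertices, angles) for x in probes]
turns = [b - a for a, b in zip(directions, directions[1:])]
print(directions)
print(turns)
```

The direction of dw is constant on each segment, and each computed turn equals π minus the corresponding interior angle, as the text asserts.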

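Now the mapping function w, obtained by integrating (2), is

(4) $$w = K\int\left[(z - x_1)^{(\alpha_1/\pi) - 1}(z - x_2)^{(\alpha_2/\pi) - 1} \cdots (z - x_n)^{(\alpha_n/\pi) - 1}\right]dz + C$$

and this can be thought of as the result of the two transformations

(5) $$t = \int\left[(z - x_1)^{(\alpha_1/\pi) - 1}(z - x_2)^{(\alpha_2/\pi) - 1} \cdots (z - x_n)^{(\alpha_n/\pi) - 1}\right]dz$$

(6) $$w = Kt + C$$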

The first of these transforms the x-axis into some polygon which 
the second then translates, rotates, and either stretches or shrinks, 
as the case may be. If, then, the polygon determined by (5) is 
similar to the given polygon, the constants in (6) can always be 
determined so as to make the two polygons coincide. 

Now for two polygons to be similar, not only must corresponding
angles be equal but corresponding sides must be proportional.
For triangles this is automatically the case. For quadrilaterals
one further condition is required, namely, that two pairs
of corresponding sides have the same ratio. For pentagons two
such conditions are required, and, in general, for a polygon of
n sides, n - 3 conditions, over and above the equality of
corresponding angles, are necessary for similarity. Hence, in mapping
a polygon of n sides onto a half plane, three of the image
points $x_1, x_2, \ldots, x_n$ can be assigned arbitrarily, following
which the remaining n - 3 are determined by the conditions of
similarity. In many important problems, a vertex of the polygon,
usually an infinite one, will correspond to z = ∞. In this case
dw/dz contains one less term than usual and, hence, one less
parameter. Therefore, only two of the n - 1 finite image points
$x_1, x_2, \ldots, x_{n-1}$ can be specified arbitrarily. In either case the
resulting transformation is known as the Schwarz-Christoffel
transformation.* Obviously, since w is analytic everywhere,
except possibly at the points $x_1, x_2, \ldots, x_n$, the transformation
is conformal. In practice, the usefulness of the Schwarz-Christoffel
transformation is often limited by the complexity of the integral
which defines the mapping function w.

EXAMPLE 1 

Find the transformation which maps the semi-infinite strip shown in Fig. 17.12a onto the half
plane, as indicated.

The required transformation is defined by

$$\frac{dw}{dz} = K(z + 1)^{(\pi/2)/\pi - 1}(z - 1)^{(\pi/2)/\pi - 1} = K(z + 1)^{-1/2}(z - 1)^{-1/2}$$

Hence,

$$w = K\int\frac{dz}{\sqrt{z^2 - 1}} = K\cosh^{-1} z + C$$


* Named for the German mathematicians H. A. Schwarz (1843-1921) and
E. B. Christoffel (1829-1900), who discovered it independently about 1865.


FIGURE 17.12

A semi-infinite strip to be mapped onto a half plane: (a) the w-plane, (b) the z-plane.


Since w = 0 is to correspond to z = 1, we have

$$0 = K\cosh^{-1} 1 + C \quad\text{or}\quad C = 0$$

Also, w = iπ is to correspond to z = -1, and thus

$$i\pi = K\cosh^{-1}(-1) = K(i\pi) \quad\text{or}\quad K = 1$$

The required transformation is, therefore, $w = \cosh^{-1} z$, or

$$z = \cosh w$$

Broken down into real and imaginary parts, this becomes

$$x + iy = \cosh u\cos v + i\sinh u\sin v$$

or

$$x = \cosh u\cos v \qquad y = \sinh u\sin v$$

Eliminating u and v in turn, we have also

$$\frac{x^2}{\cosh^2 u} + \frac{y^2}{\sinh^2 u} = 1 \qquad \frac{x^2}{\cos^2 v} - \frac{y^2}{\sin^2 v} = 1$$

which, if necessary, can be solved for u and v in terms of x and y.
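
The mapping z = cosh w is easy to test numerically. The following sketch assumes the strip is u > 0, 0 < v < π, as the corner correspondences w = 0 and w = iπ suggest; the function name and the sample points are illustrative.

```python
import cmath
import math


def strip_to_half_plane(w):
    """z = cosh w, the mapping obtained in this example."""
    return cmath.cosh(w)


# The two finite corners of the strip should land on z = 1 and z = -1.
corner_images = (strip_to_half_plane(0j), strip_to_half_plane(1j * math.pi))

# Interior points u > 0, 0 < v < pi should land in the upper half plane.
interior = [complex(u, v) for u in (0.1, 1.0, 3.0) for v in (0.2, math.pi / 2, 3.0)]
images = [strip_to_half_plane(w) for w in interior]

# Each image should lie on the ellipse x^2/cosh^2 u + y^2/sinh^2 u = 1.
ellipse_residuals = [
    (z.real / math.cosh(w.real)) ** 2 + (z.imag / math.sinh(w.real)) ** 2 - 1.0
    for w, z in zip(interior, images)
]
print(corner_images)
print(all(z.imag > 0 for z in images))
```

The segments u = const of the strip therefore map onto half ellipses with foci at z = ±1, in agreement with the first of the two eliminated equations.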


EXAMPLE 2 

Find the transformation which maps the infinite region shown in Fig. 17.13a onto the upper half 
plane as indicated. 

With images assigned as shown and with the angle at the finite vertex A identified as $\alpha_1 = 2\pi$
and the angle at the infinite vertex B identified as $\alpha_2 = 0$, we have

$$\frac{dw}{dz} = K(z + 1)^{(2\pi/\pi) - 1}z^{(0/\pi) - 1} = K\frac{z + 1}{z}$$


FIGURE 17.13

A semi-infinite channel to be mapped onto a half plane: (a) the w-plane, (b) the z-plane.


and 

(7) w = K(z + In z) + C 

To determine the constants K and C, we write (7) in the form

$$u + iv = (K_1 + iK_2)(x + iy + \ln|z| + i\arg z) + C_1 + iC_2$$

from which, by equating imaginary parts, we obtain

(8) $$v = K_1y + K_2x + K_2\ln|z| + K_1\arg z + C_2$$

Now, when w becomes infinite along AB, on which v = π, the image point z approaches zero along
the negative real axis, on which y = 0 and arg z = π. Hence, from (8),

$$\pi = \lim_{z \to 0^-}\left(K_1 \cdot 0 + K_2x + K_2\ln|z| + K_1\pi + C_2\right)$$

Obviously $K_2$ must be zero to keep ln |z| from making the right member infinite. Hence,

(9) $$\pi = K_1\pi + C_2$$

Also, as w becomes infinite along OB, on which v = 0, the image point z approaches zero along
the positive real axis, on which y = 0 and arg z = 0. Hence, using (8) again, we have

$$0 = \lim_{z \to 0^+}\left(K_1 \cdot 0 + C_2\right) = C_2$$

Therefore, $C_2 = 0$, and so from (9) we find that $K_1 = 1$. Thus (7) reduces to

$$w = z + \ln z + C_1$$

Finally, the point w = iπ must map into the point z = -1. Hence, from the last equation,

$$i\pi = -1 + \ln(-1) + C_1 = -1 + i\pi + C_1$$

and so $C_1$ must equal 1. The required mapping function is, therefore,

$$w = z + \ln z + 1$$

Figure 17.14 shows the curves in the w-plane which correspond to the lines x = 0, 1/2, 1 and
the lines y = π/4, π/2, 3π/4. The resulting configuration can be shown to represent either the
lines of equal velocity potential and the streamlines for the flow of an ideal incompressible 
fluid from an infinite straight channel or the lines of flux and the equipotential lines for a 
parallel-plate condenser. 
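
The boundary behavior of w = z + ln z + 1 can be checked with a few evaluations along the real z-axis. This is a small sketch using the principal branch of the logarithm, which is what the steps above assume; the function name `channel_map` and the sample abscissas are illustrative.

```python
import cmath
import math


def channel_map(z):
    """w = z + ln z + 1 with the principal branch of the logarithm."""
    return z + cmath.log(z) + 1


# The vertex A: z = -1 should map to w = i*pi.
w_A = channel_map(complex(-1.0, 0.0))

# Negative real z should map to the line v = pi, never to the right of A.
upper_wall = [channel_map(complex(-x, 0.0)) for x in (0.01, 0.5, 1.0, 2.0, 50.0)]

# Positive real z should map to the line v = 0.
lower_wall = [channel_map(complex(x, 0.0)) for x in (0.01, 0.5, 1.0, 2.0, 50.0)]

print(w_A)
print([round(w.real, 4) for w in upper_wall])
```

Along the negative real axis the real part of w equals -x + ln x + 1 with x = |z|, which rises to 0 at x = 1 and then falls again, so the image runs out to A and doubles back along the line v = π.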






EXERCISES 

1 Find the transformation which will map the region shown in Fig. 17.15a onto the upper half 
plane, as indicated. 


FIGURE 17.15 



2 Using the results of Exercise 1, find the steady-state temperature distribution in the w-plane
if the upper side of the negative u-axis is kept at the temperature T = 100° and the lower
side of the negative u-axis is kept at the temperature T = 0°.

3 Find the transformation which will map the exterior of the first quadrant in the w-plane
into the upper half of the z-plane, as indicated in Fig. 17.16.

FIGURE 17.16

4 Using the results of Exercise 3, find the equations of the isothermal curves in the exterior of
the first quadrant of the w-plane if the positive u-axis is kept at the temperature T = 100°
and the positive half of the v-axis is kept at the temperature T = 0°.

5 Find the transformation which will map the region shown in Fig. 17.17a onto the upper half
plane, as indicated.

FIGURE 17.17

6 Using the results of Exercise 5, find the steady-state temperature distribution in the upper
half of the w-plane if the u-axis is kept at the temperature T = 0° and the segment of the
v-axis between 0 and i is kept at the temperature T = 100°.


7 By first mapping into the z-plane as indicated, find the steady-state temperature at any
point in the region shown in Fig. 17.18.


FIGURE 17.18

8 Find the transformation which will map the region shown in Fig. 17.19a onto the upper
half plane, as indicated.



FIGURE 17.19

9 Find the transformation which will map the region shown in Fig. 17.20a onto the upper
half plane, as indicated.



FIGURE 17.20

10 Find the transformation which will map the interior of the infinite strip

$$0 \le \operatorname{Im}(w) \le \pi$$

onto the upper half of the z-plane. (Hint: Consider the strip as the limiting form of the
quadrilateral shown in Fig. 17.21, as $w_1$ and $w_3$ become infinite, and let $w_1$, $w_2$, and $w_3$
correspond, respectively, to z = 0, 1, and ∞, with the image of $w_4$ to be determined.)



Appendix


A.1

Graeffe’s root-squaring process

At various points in our work, notably in the solution of systems
of simultaneous differential equations and the determination of
the inverse Laplace transforms of rational fractional functions
of s, we found it necessary to solve polynomial equations of
relatively high degree. Since exact formulas for the roots of a
polynomial equation $P_n(x) = 0$ exist only when n ≤ 4 and since, when
n = 3 and n = 4, these are complicated and awkward to use, it
is clear that the roots of $P_n(x) = 0$ will usually have to be found
by some process of numerical approximation. Numerous procedures
are available for this purpose, including Newton’s method
and the method of interpolation. However, in many respects the
best method is what is known as Graeffe’s root-squaring process,*
which has the desirable feature that it yields all the roots, both
real and complex, at essentially the same time. To present the
theory of this method, let us assume first that the roots of the
given equation

(1) $$x^n + a_1x^{n-1} + a_2x^{n-2} + a_3x^{n-3} + \cdots + a_{n-1}x + a_n = 0$$

say $-r_1, -r_2, -r_3, \ldots, -r_n$, are all real and distinct and have
been arranged in decreasing order of absolute value from $-r_1$ to $-r_n$.

Now let us rewrite Eq. (1) with all even powers of x on one
side and all odd powers of x on the other:

$$x^n + a_2x^{n-2} + a_4x^{n-4} + \cdots = -(a_1x^{n-1} + a_3x^{n-3} + a_5x^{n-5} + \cdots)$$

and then square both sides:


* Named for the Swiss mathematician C. H. Graeffe (1799-1873), who published
this method in 1837 in a paper which won a prize offered by the
Academy of Sciences of Berlin for a practical method of computing complex
roots.


$$\begin{aligned}
x^{2n} &+ a_2^2x^{2n-4} + a_4^2x^{2n-8} + \cdots \\
&+ 2a_2x^{2n-2} + 2a_4x^{2n-4} + \cdots \\
&+ 2a_2a_4x^{2n-6} + \cdots \\
={}& a_1^2x^{2n-2} + a_3^2x^{2n-6} + a_5^2x^{2n-10} + \cdots \\
&+ 2a_1a_3x^{2n-4} + 2a_1a_5x^{2n-6} + \cdots \\
&+ 2a_3a_5x^{2n-8} + \cdots
\end{aligned}$$

Collecting terms again on the left, we obtain

(2) $$x^{2n} - (a_1^2 - 2a_2)x^{2n-2} + (a_2^2 - 2a_1a_3 + 2a_4)x^{2n-4} - (a_3^2 - 2a_2a_4 + 2a_1a_5 - 2a_6)x^{2n-6} + \cdots = 0$$

Since Eq. (2) was obtained from (1) by squaring, it is evident
that it will vanish for any value of x for which Eq. (1) vanishes.
In other words, any root of the original equation is also a root
of the derived equation (2). Now, in (2) let $y = -x^2$. Then

$$\begin{aligned}
x^{2n} &= (-y)^n = (-1)^ny^n \\
x^{2n-2} &= (-y)^{n-1} = -(-1)^ny^{n-1} \\
x^{2n-4} &= (-y)^{n-2} = (-1)^ny^{n-2}
\end{aligned}$$

and dividing out $(-1)^n$, Eq. (2) becomes

(3) $$y^n + (a_1^2 - 2a_2)y^{n-1} + (a_2^2 - 2a_1a_3 + 2a_4)y^{n-2} + (a_3^2 - 2a_2a_4 + 2a_1a_5 - 2a_6)y^{n-3} + \cdots = 0$$
By virtue of the substitution $y = -x^2$, it is evident that the
roots of (3) are $-(-r_1)^2, -(-r_2)^2, \ldots, -(-r_n)^2$, or $-r_1^2,
-r_2^2, \ldots, -r_n^2$. We have thus constructed a new equation
whose roots are numerically equal to the squares of the roots
of the original equation. Obviously, by repeating this process,
equations can be obtained whose roots are numerically equal to
the fourth, eighth, sixteenth, thirty-second, . . . powers of the
roots of Eq. (1).

The effect of this root-squaring process is to give equations 
whose roots are more and more widely separated. For instance, 
if two roots of the given equation are in the ratio 5 : 4, then their
128th powers are in the ratio

$$5^{128} : 4^{128} \quad\text{or}\quad 2.54 \times 10^{12} : 1$$

This is a highly desirable situation, for, as we shall soon see, 
equations whose roots are widely separated can readily be solved 
with considerable accuracy. 

The squaring process which leads to equations whose roots 
are high powers of the roots of a given equation may be carried 
out systematically in tabular form as follows. Write down the 
successive coefficients in the original equation, taking care to 
write zero for the coefficient of any missing term. Then under 
each coefficient write its square and twice all products of coeffi- 
cients symmetrically located on each side of it, the signs of these 
products being alternately negative and positive as coefficients 
farther and farther from the one in question are multiplied. The
sum of the terms placed below any coefficient in this array is
the coefficient of the corresponding term in the new equation.
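
The tabular rule can be sketched in a few lines of code. The function name `graeffe_step` and the test cubic are illustrative choices, not part of the text; the coefficient list is assumed to start with the leading coefficient 1, as in Eq. (1).

```python
def graeffe_step(a):
    """One root-squaring step on the coefficient list [1, a1, a2, ..., an].

    Under each coefficient we form its square, then add twice the products of
    coefficients symmetrically located about it, with alternating signs
    starting with a minus sign for the nearest pair.
    """
    n = len(a) - 1
    b = []
    for k in range(n + 1):
        total = a[k] * a[k]
        sign, j = -1, 1
        while k - j >= 0 and k + j <= n:
            total += sign * 2 * a[k - j] * a[k + j]
            sign, j = -sign, j + 1
        b.append(total)
    return b


# (x + 1)(x + 2)(x + 3) = x^3 + 6x^2 + 11x + 6, with roots -1, -2, -3.
coeffs = [1, 6, 11, 6]
squared = graeffe_step(coeffs)
print(squared)  # coefficients of (y + 1)(y + 4)(y + 9)
```

For this cubic the new coefficients are those of y³ + 14y² + 49y + 36, whose roots -1, -4, -9 are numerically the squares of the original roots.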

Now suppose that by a number of repetitions of this process
an equation

(4) $$z^n + \alpha_1z^{n-1} + \alpha_2z^{n-2} + \cdots + \alpha_{n-1}z + \alpha_n = 0$$

has been obtained whose roots are numerically the mth powers
of those of the given equation, or specifically $-r_1^m, -r_2^m, -r_3^m,
\ldots, -r_n^m$. From the usual relations between the roots and the
coefficients of a polynomial equation, it follows that

(5) $$\begin{aligned}
\alpha_1 &= -(\text{sum of the roots}) = r_1^m + r_2^m + r_3^m + \cdots + r_n^m \\
\alpha_2 &= \text{sum of the roots taken two at a time} \\
&= r_1^mr_2^m + r_1^mr_3^m + \cdots + r_2^mr_3^m + \cdots + r_{n-1}^mr_n^m \\
\alpha_3 &= -(\text{sum of the roots taken three at a time}) \\
&= r_1^mr_2^mr_3^m + r_1^mr_2^mr_4^m + \cdots + r_2^mr_3^mr_4^m + \cdots + r_{n-2}^mr_{n-1}^mr_n^m \\
&\;\;\vdots \\
\alpha_n &= (-1)^n(\text{product of all the roots}) = r_1^mr_2^mr_3^m \cdots r_n^m
\end{aligned}$$

Under the hypothesis that $|r_1| > |r_2| > |r_3| > \cdots > |r_n|$
and that m is large, say 128 or 256, it follows that $r_1^m$ is
enormously larger than $r_2^m$, which in turn is enormously larger than
$r_3^m$, and so on. Hence, to a high degree of approximation,

$$\begin{aligned}
\alpha_1 &= r_1^m \\
\alpha_2 &= r_1^mr_2^m \\
\alpha_3 &= r_1^mr_2^mr_3^m \\
&\;\;\vdots \\
\alpha_{n-1} &= r_1^mr_2^mr_3^m \cdots r_{n-1}^m \\
\alpha_n &= r_1^mr_2^mr_3^m \cdots r_{n-1}^mr_n^m
\end{aligned}$$

since in each case the first term is very much larger than any
or all of the succeeding terms.

Thus, $|r_1|$ can be found simply by extracting the mth root
of the second coefficient in the final equation (4). Similarly, since
the ratio of the third coefficient to the second is

$$\frac{\alpha_2}{\alpha_1} = \frac{r_1^mr_2^m}{r_1^m} = r_2^m$$

$|r_2|$ can be found by extracting the mth root of this ratio. In the
same way, $|r_3|$ is the mth root of the ratio of the fourth coefficient
to the third, and so on. The signs of the roots cannot be determined
by this procedure, but can easily be inferred from a rough
sketch of the graph of the left member of the original equation
(1) supplemented by Descartes’s rule of signs. Example 1 will
make the details clear.
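
A minimal end-to-end sketch for real, distinct roots follows. It repeats the root-squaring step a fixed number of times and extracts mth roots of successive coefficient ratios. The helper names are hypothetical, and the sign of each root is chosen here by substituting ±r into the original polynomial, which is a simple stand-in for the graph sketch and Descartes's rule mentioned in the text.

```python
def graeffe_step(a):
    """One root-squaring step on [1, a1, ..., an] (square plus alternating products)."""
    n = len(a) - 1
    b = []
    for k in range(n + 1):
        total = a[k] * a[k]
        sign, j = -1, 1
        while k - j >= 0 and k + j <= n:
            total += sign * 2 * a[k - j] * a[k + j]
            sign, j = -sign, j + 1
        b.append(total)
    return b


def root_magnitudes(a, steps):
    """mth roots of ratios of successive coefficients, with m = 2**steps."""
    b = list(a)
    for _ in range(steps):
        b = graeffe_step(b)          # exact integer arithmetic
    m = 2 ** steps
    return [(b[k + 1] / b[k]) ** (1.0 / m) for k in range(len(b) - 1)]


def poly(a, x):
    """Evaluate x^n + a1 x^(n-1) + ... + an by Horner's rule."""
    value = 0.0
    for c in a:
        value = value * x + c
    return value


coeffs = [1, 6, 11, 6]               # (x + 1)(x + 2)(x + 3)
magnitudes = root_magnitudes(coeffs, steps=6)
# Sign choice by substitution: an assumption of this sketch, not the book's procedure.
roots = [r if abs(poly(coeffs, r)) < abs(poly(coeffs, -r)) else -r for r in magnitudes]
print(roots)
```

With six squarings (m = 64) the magnitudes agree with 3, 2, 1 to well within ordinary floating-point accuracy for this example.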

Just how many root-squaring operations must be performed 
in a given case, i.e., just how large m should be, cannot be told
in advance. An adequate working rule is to continue until each 
new coefficient (with certain exceptions in the case of multiple 
roots and complex roots) is essentially the square of the preceding 
one, i.e., until the product terms in the calculation of the new 
coefficients make no appreciable contribution. 

The case of equal roots can be handled without difficulty
by returning to Eq. (4) and supposing two of the r’s to be equal,
say $r_3 = r_4$. Then when all but the dominant terms in each
coefficient are rejected, we find from (5) that

$$\begin{aligned}
\alpha_1 &= r_1^m \\
\alpha_2 &= r_1^mr_2^m \\
\alpha_3 &= r_1^mr_2^mr_3^m + r_1^mr_2^mr_4^m = 2r_1^mr_2^mr_3^m \\
\alpha_4 &= r_1^mr_2^mr_3^mr_4^m = r_1^mr_2^mr_3^{2m} \\
&\;\;\vdots \\
\alpha_{n-1} &= r_1^mr_2^mr_3^mr_4^mr_5^m \cdots r_{n-1}^m = r_1^mr_2^mr_3^{2m}r_5^m \cdots r_{n-1}^m \\
\alpha_n &= r_1^mr_2^mr_3^mr_4^mr_5^m \cdots r_{n-1}^mr_n^m = r_1^mr_2^mr_3^{2m}r_5^m \cdots r_{n-1}^mr_n^m
\end{aligned}$$

Hence, to a high degree of approximation, the final equation is

$$z^n + r_1^mz^{n-1} + r_1^mr_2^mz^{n-2} + 2r_1^mr_2^mr_3^mz^{n-3} + r_1^mr_2^mr_3^{2m}z^{n-4} + \cdots = 0$$

Evidently, for each pair of equal roots there will be one term 
(the fourth in this case) which, as the root-squaring process is 
repeated, does not approach the square of its previous value but 
instead approaches one-half this value. In other words, when 
two equal roots are present, there is always one coefficient for 
which in the root-squaring procedure the product of adjacent 
coefficients never becomes negligible but always makes a con- 
tribution approaching one-half the square of the coefficient itself : 




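$$\begin{array}{ccc}
r_1^mr_2^m & 2r_1^mr_2^mr_3^m & r_1^mr_2^mr_3^{2m} \\
\hline
 & 4r_1^{2m}r_2^{2m}r_3^{2m} & \\
 & -2(r_1^mr_2^m)(r_1^mr_2^mr_3^{2m}) & \\
\hline
 & 2r_1^{2m}r_2^{2m}r_3^{2m} &
\end{array}$$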





When one or more coefficients behave in this manner as successive 
equations are constructed, the presence of double roots is always 
indicated, and the process can be terminated as soon as all but 
the exceptional coefficient or coefficients are uninfluenced by the 
product terms. 

The determination of the roots in this case, once it is recognized,
is simple enough. All roots except the repeated one can
be found just as before by extracting the mth root of the ratios
of successive pairs of nonexceptional coefficients. The repeated
root can be found by extracting the 2mth root of the ratio of the
coefficients immediately following and immediately preceding
the exceptional coefficient. Here again, the signs of the roots
must be determined by subsequent inspection.
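
The behavior of the exceptional coefficient can be observed on a small example. The sketch below uses the hypothetical cubic (x + 1)(x + 2)², in which the double root is the largest in absolute value, so that the exceptional coefficient is the second one; the variable names are illustrative.

```python
def graeffe_step(a):
    """One root-squaring step on [1, a1, ..., an] (square plus alternating products)."""
    n = len(a) - 1
    b = []
    for k in range(n + 1):
        total = a[k] * a[k]
        sign, j = -1, 1
        while k - j >= 0 and k + j <= n:
            total += sign * 2 * a[k - j] * a[k + j]
            sign, j = -sign, j + 1
        b.append(total)
    return b


coeffs = [1, 5, 8, 4]                 # (x + 1)(x + 2)^2, roots -2, -2, -1
history = [coeffs]
for _ in range(6):
    history.append(graeffe_step(history[-1]))
m = 2 ** 6
final, previous = history[-1], history[-2]

# The exceptional coefficient approaches one-half the square of its predecessor.
halving_ratio = final[1] / previous[1] ** 2

# Repeated root: 2m-th root of (coefficient after) / (coefficient before) the exceptional one.
repeated = (final[2] / final[0]) ** (1.0 / (2 * m))

# Simple root: m-th root of the ratio of the two nonexceptional coefficients.
simple = (final[3] / final[2]) ** (1.0 / m)
print(halving_ratio, repeated, simple)
```

The ratio settles near 0.5 rather than 1, which is the signal described above, and the two extracted magnitudes are close to 2 and 1.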

When the given equation contains a pair of complex roots,
necessarily conjugates of each other, the analysis is a little different.
To investigate this case, suppose specifically that the equation
to be solved is of the fourth degree with roots $-r_1$, $-r_2e^{i\theta}$,
$-r_2e^{-i\theta}$, $-r_3$, whose absolute values are such that $|r_1| > |r_2| > |r_3|$.
The equation can then be written

$$(x + r_1)(x + r_2e^{i\theta})(x + r_2e^{-i\theta})(x + r_3) = 0$$

After the root-squaring operation has been repeated until the roots
have been raised to the mth power, the resultant equation has roots

$$-r_1^m \qquad -r_2^me^{im\theta} \qquad -r_2^me^{-im\theta} \qquad -r_3^m$$

and can, therefore, be written

$$(z + r_1^m)(z + r_2^me^{im\theta})(z + r_2^me^{-im\theta})(z + r_3^m) = 0$$

or

(6) $$\begin{aligned}
z^4 &+ (r_1^m + r_2^me^{im\theta} + r_2^me^{-im\theta} + r_3^m)z^3 \\
&+ (r_1^mr_2^me^{im\theta} + r_1^mr_2^me^{-im\theta} + r_1^mr_3^m + r_2^me^{im\theta}r_2^me^{-im\theta} + r_2^me^{im\theta}r_3^m + r_2^me^{-im\theta}r_3^m)z^2 \\
&+ (r_1^mr_2^me^{im\theta}r_2^me^{-im\theta} + r_1^mr_2^me^{im\theta}r_3^m + r_1^mr_2^me^{-im\theta}r_3^m + r_2^me^{im\theta}r_2^me^{-im\theta}r_3^m)z \\
&+ (r_1^mr_2^me^{im\theta}r_2^me^{-im\theta}r_3^m) = 0
\end{aligned}$$

where the coefficients have been expressed at length as the 
appropriate symmetric functions of the roots. 

In every coefficient in (6) except the coefficient of $z^2$, the
first term is obviously the term of greatest absolute value. This
is not the case in the coefficient of $z^2$, however, for by combining
terms this can be written

$$2r_1^mr_2^m\cos m\theta + r_1^mr_3^m + r_2^{2m} + 2r_2^mr_3^m\cos m\theta$$

and it is clear that, if cos mθ is approximately 1 or -1, the first
term is dominant, whereas, if cos mθ is approximately 0, one
of the later terms is dominant. Thus, as m increases, the coefficient
of $z^2$ continuously fluctuates in sign and does not become and
remain positive, as it does when all the roots are real. This is the
characteristic which identifies the presence of complex roots in
polynomial equations of all degrees, as many coefficients behaving
in this manner as there are pairs of complex roots.

Once the existence of complex roots is recognized, it is a
simple matter to obtain the absolute values of the real roots by
extracting the mth root of the ratios of successive nonexceptional
coefficients, just as before. Moreover, the modulus of the complex
roots can be found by taking the 2mth root of the quotient of
the coefficient which immediately follows the exceptional one
divided by the coefficient which immediately precedes the
exceptional one.

To complete the determination of the complex roots, for
which at this stage only the absolute value is known, let those
roots now be written in the form u ± iv. In the original equation,
the coefficient of $x^{n-1}$ is the negative of the sum of the roots;
hence,

$$a_1 = -[-r_1 + (u + iv) + (u - iv) - r_3 - \cdots]$$

and thus

(7) $$u = -\frac{a_1 - r_1 - r_3 - \cdots}{2} = -\tfrac{1}{2}(\text{coefficient of } x^{n-1} + \text{sum of all real roots})$$

As soon as u is determined, v can be found from the familiar
identity $r_2^2 = u^2 + v^2$.
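
The recipe for a single complex pair can be traced on a small example. The sketch below uses the hypothetical cubic (x + 3)(x² - 2x + 5), whose roots are -3 and 1 ± 2i; the names are illustrative, and the sign of the real root is picked by substitution, which is an assumption of this sketch rather than the inspection procedure of the text.

```python
import math


def graeffe_step(a):
    """One root-squaring step on [1, a1, ..., an] (square plus alternating products)."""
    n = len(a) - 1
    b = []
    for k in range(n + 1):
        total = a[k] * a[k]
        sign, j = -1, 1
        while k - j >= 0 and k + j <= n:
            total += sign * 2 * a[k - j] * a[k + j]
            sign, j = -sign, j + 1
        b.append(total)
    return b


def poly(a, x):
    """Evaluate x^n + a1 x^(n-1) + ... + an by Horner's rule."""
    value = 0.0
    for c in a:
        value = value * x + c
    return value


coeffs = [1, 1, -1, 15]               # x^3 + x^2 - x + 15 = (x + 3)(x^2 - 2x + 5)
steps = 7
m = 2 ** steps
b = list(coeffs)
for _ in range(steps):
    b = graeffe_step(b)

# Here the third coefficient b[2] is the exceptional one.
real_magnitude = b[1] ** (1.0 / m)                 # m-th root of a nonexceptional ratio
modulus = (b[3] / b[1]) ** (1.0 / (2 * m))         # 2m-th root across the exceptional one

# Sign of the real root chosen by substitution (assumption of this sketch).
real_root = (real_magnitude
             if abs(poly(coeffs, real_magnitude)) < abs(poly(coeffs, -real_magnitude))
             else -real_magnitude)

u = -0.5 * (coeffs[1] + real_root)                 # formula for the real part
v = math.sqrt(max(modulus ** 2 - u ** 2, 0.0))     # from modulus^2 = u^2 + v^2
print(real_root, u, v)
```

The computed values are close to -3 for the real root and to u = 1, v = 2 for the complex pair, matching the factorization used to build the example.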

As we have already remarked, the presence of more than
one pair of complex roots is indicated by the presence of more
than one coefficient which fluctuates in sign as the root squaring
continues. In this case all real roots can be found, just as before,
from adjacent pairs of nonexceptional coefficients by extracting
the mth root of their quotients. The moduli of the various complex
roots can also be found, as before, by taking the 2mth root
of the ratios of the coefficients immediately after and immediately
before each exceptional one, the exceptional coefficients
being necessarily nonadjacent if the pairs of complex roots are
of different absolute value. The only modification required in
this case is in the determination of the real parts of the various
complex roots.

To illustrate this modification, let the given equation contain
two pairs of complex roots

\[ u_1 \pm iv_1 \quad\text{and}\quad u_2 \pm iv_2 \]

of absolute values $r_1$ and $r_2$, together with additional real roots
$-r_3, -r_4, \ldots$. As before, the coefficient of $x^{n-1}$ in the original
equation is the negative of the sum of the roots; hence,

\[
\begin{aligned}
a_1 &= -[(u_1 + iv_1) + (u_1 - iv_1) + (u_2 + iv_2) + (u_2 - iv_2) - r_3 - r_4 - \cdots] \\
    &= -2u_1 - 2u_2 + r_3 + r_4 + \cdots
\end{aligned}
\]

or

\[ u_1 + u_2 = -\tfrac{1}{2}\,(\text{coefficient of } x^{n-1} + \text{sum of all real roots}) \tag{7} \]

This is one equation in the unknown real components $u_1$ and $u_2$.

To obtain a second equation in $u_1$ and $u_2$, we make use of
the fact that the coefficient of x in the original equation is equal to

\[ (-1)^{n-1}\,(\text{sum of the products of the roots taken } n - 1 \text{ at a time}) \]

Hence,

\[
\begin{aligned}
a_{n-1} &= (-1)^{n-1}[(u_1 - iv_1)(u_2 + iv_2)(u_2 - iv_2)(-r_3)(-r_4)\cdots \\
&\qquad + (u_1 + iv_1)(u_2 + iv_2)(u_2 - iv_2)(-r_3)(-r_4)\cdots \\
&\qquad + (u_1 + iv_1)(u_1 - iv_1)(u_2 - iv_2)(-r_3)(-r_4)\cdots \\
&\qquad + (u_1 + iv_1)(u_1 - iv_1)(u_2 + iv_2)(-r_3)(-r_4)\cdots \\
&\qquad + (u_1 + iv_1)(u_1 - iv_1)(u_2 + iv_2)(u_2 - iv_2)\,(\text{sum of all products of the } n - 4 \text{ real roots taken } n - 5 \text{ at a time})] \\
&= (-1)^{n-1}[(u_1 - iv_1)\,r_2^2\,(\text{product of the } n - 4 \text{ real roots}) \\
&\qquad + (u_1 + iv_1)\,r_2^2\,(\text{product of the } n - 4 \text{ real roots}) \\
&\qquad + (u_2 - iv_2)\,r_1^2\,(\text{product of the } n - 4 \text{ real roots}) \\
&\qquad + (u_2 + iv_2)\,r_1^2\,(\text{product of the } n - 4 \text{ real roots}) \\
&\qquad + r_1^2 r_2^2\,(\text{sum of all products of the } n - 4 \text{ real roots taken } n - 5 \text{ at a time})] \\
&= (-1)^{n-1}[(2u_1 r_2^2 + 2u_2 r_1^2)\,(\text{product of the } n - 4 \text{ real roots}) \\
&\qquad + r_1^2 r_2^2\,(\text{sum of all products of the } n - 4 \text{ real roots taken } n - 5 \text{ at a time})]
\end{aligned}
\]

Hence, finally,

\[ u_1 r_2^2 + u_2 r_1^2 = \frac{(-1)^{n-1} a_{n-1} - r_1^2 r_2^2\,(\text{sum of all products of the } n - 4 \text{ real roots taken } n - 5 \text{ at a time})}{2\,(\text{product of all } n - 4 \text{ real roots})} \]

or

\[ u_1 r_2^2 + u_2 r_1^2 = \frac{(-1)^{n-1} a_{n-1}}{2\,(\text{product of all } n - 4 \text{ real roots})} - \frac{r_1^2 r_2^2}{2}\,(\text{sum of reciprocals of all } n - 4 \text{ real roots}) \tag{8} \]


From Eqs. (7) and (8), $u_1$ and $u_2$ can be found at once. Then $v_1$
and $v_2$ can be determined from the relations

\[ r_1^2 = u_1^2 + v_1^2 \quad\text{and}\quad r_2^2 = u_2^2 + v_2^2 \]

When more than two pairs of complex roots are present, this
procedure can be generalized by using, in addition to the relation

\[ a_1 = -(\text{sum of all the roots}) \]

and the x-coefficient relation, other relations arising from the
coefficients of $x^2, x^3, \ldots$. However, these further equations in
the real components $u_1, u_2, u_3, u_4, \ldots$ are nonlinear, and
solving them for $u_1, u_2, u_3, u_4, \ldots$ may be very difficult.
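For the case of exactly two complex pairs, the two linear relations (7) and (8) can be solved mechanically. The following Python sketch is illustrative only: the function name `complex_pairs` and its argument layout are hypothetical, the equation is assumed to be monic with coefficients listed from the highest power down, the real roots are assumed to be nonzero, and the two moduli are assumed to be distinct.

```python
import math


def complex_pairs(coeffs, real_roots, r1, r2):
    """Solve Eqs. (7) and (8) for the pairs u1 +/- i*v1 and u2 +/- i*v2.

    coeffs     : [1, a1, ..., an] of the monic original equation
    real_roots : the n - 4 real roots, with their signs
    r1, r2     : the distinct moduli of the two complex pairs
    """
    n = len(coeffs) - 1
    a1 = coeffs[1]          # coefficient of x^(n-1)
    a_n1 = coeffs[n - 1]    # coefficient of x
    # Eq. (7): u1 + u2 = -(a1 + sum of all real roots)/2
    s = -0.5 * (a1 + sum(real_roots))
    # Eq. (8): u1*r2^2 + u2*r1^2 = t
    product = math.prod(real_roots)
    reciprocals = sum(1.0 / r for r in real_roots)
    t = (-1) ** (n - 1) * a_n1 / (2.0 * product) - 0.5 * r1**2 * r2**2 * reciprocals
    # Eliminate u2 = s - u1 between the two linear equations.
    u1 = (t - s * r1**2) / (r2**2 - r1**2)
    u2 = s - u1
    v1 = math.sqrt(r1**2 - u1**2)
    v2 = math.sqrt(r2**2 - u2**2)
    return (u1, v1), (u2, v2)
```

With the coefficients 1, 1, -4, -4, -2, -5, -1, -1, the real roots 2.162, -1.936, -1.466, and the moduli 0.8264 and 0.4887 (the data of Example 1), the sketch gives real parts of about 0.233 and -0.113.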




EXAMPLE 1 

Find all the roots of the equation

\[ P_1(x) \equiv x^7 + x^6 - 4x^5 - 4x^4 - 2x^3 - 5x^2 - x - 1 = 0 \]

The construction of the successive equations presents no difficulty and is adequately set
forth in Table A.1. By the time m = 128, all coefficients are uninfluenced by the product terms
except the fifth and the seventh, which continually fluctuate in sign. We can, therefore, terminate
the root-squaring process at this stage with the assurance that there are two pairs of complex
roots and three distinct real roots.

To find the magnitudes of the three real roots, we have

\[ \log |r_1| = \frac{\log (7.4844 \times 10^{42})}{128} = 0.33495 \qquad |r_1| = 2.162 \]

\[ \log |r_2| = \frac{\log (4.1006 \times 10^{79}) - \log (7.4844 \times 10^{42})}{128} = 0.28702 \qquad |r_2| = 1.936 \]

\[ \log |r_3| = \frac{\log (7.2835 \times 10^{100}) - \log (4.1006 \times 10^{79})}{128} = 0.16601 \qquad |r_3| = 1.466 \]

Since there is only one change of sign between successive coefficients in the original equation,
only one of the three real roots can be positive. Since $P_1(2) = -39$ and $P_1(3) = 1{,}517$, the positive
root must lie between 2 and 3. Hence, the real roots are approximately 2.162, -1.936,
and -1.466.


To find the absolute values of the complex roots, we extract the 256th root of the ratios of
the coefficients just after and just before the exceptional coefficients. Thus

\[ \log r_4 = \frac{\log (4.1083 \times 10^{79}) - \log (7.2835 \times 10^{100})}{256} = 9.91700 - 10 \qquad r_4 = 0.8264 \]

\[ \log r_6 = \frac{\log (1) - \log (4.1083 \times 10^{79})}{256} = 9.68901 - 10 \qquad r_6 = 0.4887 \]


The real parts of these roots must satisfy Eqs. (7) and (8); hence

\[ u_4 + u_6 = \frac{-1 - [(2.162) + (-1.936) + (-1.466)]}{2} \]

or

\[ u_4 + u_6 = 0.120 \]

and

\[ u_4 (0.4887)^2 + u_6 (0.8264)^2 = \frac{-1}{2(2.162)(-1.936)(-1.466)} - \frac{(0.8264)^2 (0.4887)^2}{2} \left( \frac{1}{2.162} + \frac{1}{-1.936} + \frac{1}{-1.466} \right) \]

or

\[ 0.239u_4 + 0.683u_6 = -0.021 \]

Solving these two equations simultaneously, we find without difficulty that

\[ u_4 = 0.233 \quad\text{and}\quad u_6 = -0.113 \]

Hence,

\[ v_4 = \sqrt{r_4^2 - u_4^2} = 0.795 \]

\[ v_6 = \sqrt{r_6^2 - u_6^2} = 0.476 \]

The complex roots of the given equation are therefore approximately

\[ 0.233 \pm 0.795i \quad\text{and}\quad -0.113 \pm 0.476i \]
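As a rough check on the example, the approximate roots can be substituted back into the given equation. The Python sketch below does this with Horner's rule; the helper name `horner` is hypothetical, and the root values are simply the three-figure approximations quoted in the example, so the residuals should be small but not zero.

```python
def horner(coeffs, x):
    """Evaluate a polynomial with coefficients [a0, ..., an], highest power first."""
    value = 0
    for a in coeffs:
        value = value * x + a
    return value


# Coefficients of P1(x) = x^7 + x^6 - 4x^5 - 4x^4 - 2x^3 - 5x^2 - x - 1
P1 = [1, 1, -4, -4, -2, -5, -1, -1]

# Approximate roots quoted in the example (three real, two complex pairs)
roots = [
    2.162, -1.936, -1.466,
    complex(0.233, 0.795), complex(0.233, -0.795),
    complex(-0.113, 0.476), complex(-0.113, -0.476),
]

# |P1(z)| for each approximate root
residuals = [abs(horner(P1, z)) for z in roots]
```

The same helper reproduces the two sign-bracketing values used in the example, since `horner(P1, 2)` is -39 and `horner(P1, 3)` is 1517.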






EXERCISES 

Find all the roots of each of the following equations: 

1 X 3 _ Q X 2 + n x - 7 = 0 2 '** + 2* 2 + 2s + 2 - 0 

3 *«-*«- 10** -* + 1 = 0 4 4** + 16x 3 + 25s* + 21s + 9 = 0 

5 $16x^5 - 16x^4 - 12x^3 + 12x^2 - 1 = 0$    6 $x^5 - 5x^3 + 4x - 10 = 0$

7 $x^5 - 8x^4 + 17x^3 - 10x^2 + 10 = 0$

8 Discuss the application of the root-squaring process to an equation with a triple root. 

9 Discuss the application of the root-squaring process to an equation with two pairs of complex
roots of equal moduli.

10 Discuss the application of the root-squaring process to an equation with a pair of complex 
roots and a real root whose absolute value is equal to the modulus of the complex roots. 


Answers to 

Odd-numbered 

Exercises 


Chapter 1 

sec. 1.2

p.7 


sec. 1.3 

p. 11 


sec. 1.4 

p. 14 


sec. 1.5 

p. 18 


sec. 1.6 

p. 21 


1 Second-order, ordinary, nonlinear 
3 Second-order, ordinary, nonlinear 


5 Second-order, ordinary, linear
19 $y''' - 2y'' - y' + 2y = 0$
23 $y'' - 4y = 0$

1 y = cx 3 

6 y = (2 - ca:)/(l - cx) 

*9 e v (y — 1) + c~* => c 

13 No. Yes; y - { + J 

l a: 3 + 1 

19 s + e -ta-i/+i = c 

1 When the coefficient of dx i; 

3 y = x + ce 2 * 1 ^-*) 

7 y 3 = a: 3 (8 - In |x|) 

11 ?/ 3 = * 3/3 ) 

17 (y - 3) 3 + 2(x + 2)(y - 3, 
19 V* s + 2/* 


7 Second-order, partial, linear
21 $x^2 y'' - 2xy' + 2y = 0$
25 2^" - O/) 2 = 0 

3 y — In |y -f- lj — a: 2 = c 
7 y a = ce”*/a: 

11 y - 2x 2 + 2 
x < 0 
x £ 0 

simpler than the coefficient of d;/ 

5 y 2 — cx 4 — x 2 
9 ( 2 / - 2x) 2 (y + x) « 27 
/in |ex| — l\ 

?/ \ InM ) X 

- (x + 2) 2 - e 


1 x 3 — 2/ s — 3 x 2 y = c 3 l£x* + My 3 + xhj = c 

5 2ci/ = 2x + xhy + 4 y In \y\ 7 in |cx?/| = - l/xy 

9 x s 2 / 3 = ( x/y ) + c 

11 x 2 — y 2 + 2xy = c The equation is both homogeneous and exact. 

13 xhy 2 — x 2 — y 2 = c The equation is both separable and exact. 


1 


3 

7 



The equation is also exact. 


5 y 




2 sin 2 x ^ 1 

3 3 sin x 




sec. 1 .7 
p. 25 


Chapter 2 

sec. 2.1 
p. 35 


sec. 2.2 
p. 41 


9 y — %x 2 + car 1 '* 

11 x - + cy ~ 3 

i3 » - * + Vi'+T* 
17 jr 3 = a: 2 + cx 


The equation is also homogeneous. 

15 y = | x<t x < 0 

J l a; 2 - 2x 3 i§0 

19 i/ 2 = x — ^ ~f ce~ 2z 


1 

6 

9 

11 

13 

16 

17 

19 

21 

26 

27 


31 

33 

36 


37 

39 


p = I4,7e~o.oooo385fc 3 4.27 per cent; 2.91 per cent 

2.4 times 7 Q = 30 - 1500/ft + 50) 

242 days 

w = o) a e~ k,lt The flywheel will never come to rest! 
v - (w/m - e ~ hatlw ) ; s =JmA)[( - (w/* ff )(l - e-* 3i/ “)] 

— t 
myo 

v = — xo sin < x = x 0 cos >/ < 

Q ■- 2(100 - 0 - 150[(100 - fl/100] 3 0 ^ t ^ 100 

Q « 20 - 960/ (< + 48) 23 T - 20 + 80e-°- 06S3( 

$Q = (T_0 - T_1)k/h$, where k is the thermal conductivity of the material
of the wall; $T = T_0 - (T_0 - T_1)x/h$
4Wi(T 0 - Ti) 


,_j 2?/ - t/o + — 


Q 

rial of the sphere; T 


> where it is the thermal conductivity of the mate- 
nTx - nTo r a n(Ti - T 0 ) 1 


i - (J5/B)(l - 0.693L/J2 

i ■» — -7- (i2 cos wf -f- X sin co< — Re~ RtlL ) S — tan -1 — 

f? 2 + u 2 L* xt 

y — foe / ” rr « Jz/2 ’ {r 

t - (2h)?5 - (2y)% - + (2y - fc)M !/^ 

T ^ ~ thl2) = Vzh - 2-\/y V <\ 


where $t_{h/2}$ is the time it takes the tank to drain to a depth of $h/2$. The time
for the tank to drain completely is
2H + 1 Ih 

— yv 1= — The tank will never be completely empty 1 

V 0 \A 3,rr2 

x 2 = — y 2 ln |cy| 


1 a y = ci sin x -f c 2 cos x b y — Cie~ x + de~ 2x 

c y - cie x 4- c 2 xe~ z d y — ci(x — 1) + c 2 ( — x 2 + x — 1) 

e y = cix + czx~* f y — CiX + c 2 xe z 

1 Dy is a function, namely, the derivative of y, whereas yD is merely an
operator.

3 $(D + x)(D + 2x)e^x = (2x^2 + 3x + 3)e^x$    $(D + 2x)(D + x)e^x = (2x^2 + 3x + 2)e^x$
These expressions differ because, in permuting the operational coefficients,
variable terms are moved across symbols of differentiation.





sec. 2.3 
p. 49 


sec. 2.4 
p. 51 


sec. 2.5 
p. 55 


sec. 2.6 

p. 62 


6 

?/ = Cie 1 

4 c 2 e _2x 

7 

y = Cie -|- C^"^ 5 * 

9 

V = cie~ 

1,2 4 c 2 xe x/2 

11 

'•(^«o4o + Bsin fo / 

13 

17 

by — 6c 
y — 3xc 

~ 4x 4 14C 

16 

4y = e 21 4 3e~ 21 


19 There is no solution satisfying the given conditions except y = 0. 
26 Yes 


1 

3 

6 

7 

9 

11 

13 

17 


3 

5 

7 


11 

1 

3 

5 

7 

9 

11 

13 


y - cie~* + c 2 e~ Sx 4 (3x - 7)/9 
V = ci + c 2 e~* + x 4 x 2 /2 

2/ = e~*(A cos 3a: + B sin 3a;) 4 (75a 2 - 30a - 9)/250 
y = ciC + c 2 e~ x 4 }4xe x 4 Me 21 
y = ci cos x 4 c 2 sin x 4 \ie x {— 2 cos a: 4 sinx) 
y - cie~ x + c 2 xe~ x + H - Mo (3 cos 2x — 4 sin 2x) 

A = 1 16 y - c~ 2 *(6 sin x — 2 cos a) + 2e* 

In the limit when $\omega \to k$, Y becomes $-(t \cos kt)/2k$, which is the particular
integral that would have been obtained by applying the methods
of this section to the equation $y'' + k^2 y = \sin kt$.


y ~ cie~ ix 4 csxe~ ix — e~ 21 In |x| 
y = eje"* + c 2 xe~ x 4 Mx 2 (2 In 1*1 — 3)e~“ 


1 


1 , 


Y = — l A x In 2 |x| — a: In |x| — x 

y = + c a e~ ( “~ w * 4- — J Q [e -0 *"® 

y = cifi~ ax 4- dxe~ c 


-»)J/(s) ds 


+ r 


(x - s)e-“ l *-*)/(s) ds 


y = cie - * 4- c»e-- x 4- c t e~ 3x 4 x — 3 
y = cit: -21 4- e 2 *(c 2 cos x 4 ca sin x) 4- 3 cos x — sin x 

9x 2 4- 16 sin 2x 

y - ci e x 4 c 2 e~ x 4- Ca cos 3x 4- c t sin 3x - 


81 


25 


= ci 4 c 2 e* 4 e~ xl2 [A cos (V» */2) 4 B sin (V 3 x/2)J - x 3 /3 


V = 


4 e -2i _ -|- s 

~60~ 


- 2 sin x 


10 


V — }s(2e 21 4 3 cos x 4 sin x) 

Y = i lo iex ~‘ ~ 2e2(I ~” ) + e3(i_,)) ^ s) ds 


1 y = cix 4 c 2 x In |x| 4 C3/X 


: + V * +_2 + 2 


7 2tt ■%/ hw/pg, where p is the density of water 
9 r = a cosh « t 

I Xq 2 (Xq — 3xQ 


11 In each case the deflection is equal to 


6 El 

xi 2 (xi — 3xo) 
\ 6 El 

(2 n 4 1)VEI 
4L 2 




Chapter 3 

sec. 3.2 

p. 72 


sec. 3.3 
p. 78 


15 The critical speeds and the associated deflection curves are 


ZrpEIg 

ApL* 


y n = (cos z„ - 


h Zn) ^COS Z„ ~ — COsh Z„ 

+ (sin z n + sinh z„) ^sin ~ — sinh z n ^ 


where $z_n$ is the nth one of the roots of the equation


cos z cosh z = 1 

L 1 fcr8 g 

* ~ 2 irMWRt + Ig 

21 y = a cosh •y/ < 

26 « —W 


19 


(b) 


2 / 3/jLV 

2t'\ 3WL 2 +wL* 

1 l 3(WL + 

2ir\ 3 (IT + w)£ 2 

2 SIl±2S 
2» V W. 


1 * = 4, y * He -4 - 5 
3 x = c ie‘ + c 2 e -24 — 

y = - 6c 2 e -24 + e-‘ + 15) 

5 x « Cie -1 + c 2 e _!! ‘ — (14 cos 2i -f 23 sin 2t)/29 
y - H[2cie“‘ + c 2 e~ 24 ~ (88 cos 2< + 104 sin 2i)/29] 

7 a: = e~‘(A cos < + B sin t) 

y = — ^e~‘[(A + B) cos t — {A — J3) sin <] 

9 a: «* ci cos t + c 2 sin t + c 3 cos 2< + c.t sin 2 1 + 2 Mo cos 3i 

V - H (2ci cos t + 2c 2 sin t — 7c 3 cos 21 — 7 c 4 sin 2 1 — 1 H cos 3 1) 

11 x - (ci + l)e -4 + c 2 e“« + llc 3 e 44 
y = -2cje~‘ + c 2 e -44 + 3c 3 e 44 + 1 
z = cie“‘ + c 2 e -44 — 29 c 3 c 4< 

13 x = e 4 , y = e 4 , and z = e 4 

16 a: = Cie -4 + c 2 e 4 + jf sinh (* — s)[— z'\s) — z'(s) + z(s)] ds 

y — Cie~‘ — c 2 e 4 + sinh (a: — s)[— z"(s) + 2z(s)] ds 

z = z (z arbitrary) 

17 (57) - 4)z - (4 D - 5 )y = 0 (D 2 — 2£>)a: + {D — 2)y - 0 

19 Qi = 100(1 - e _4,1 °), Q 2 - 100(1 + e- 4 ' 10 ) 

lx- Cie~ 3t ~ H,y - Cie~ 3t + M 
3 x — cie 4 + 2c 2 e 24 + t, y = — Cie 4 + c 2 e 24 + 1 
5 x — —2ci + c 2 e -84 + Me 4 + 
y - Ci - 3 c 2 e~ 3t — J-^e 4 + J-^e -4 

7 x — —2ci cos £ — 2c 2 sin £ + c 3 cos 2£ + c 4 sin 2£ + H(5 — 3£) 
y — 3ci cos £ + 3c 2 sin £ — 3c s cos 2 £ — 3c 4 sin 2£ — K(7 — 5£) 

9 x — 8ci cos 2t + 8c 2 sin 21 + c 3 cos 3f + c 4 sin 3 1 + 2 Ma cos t 

y = — 7c t cos 2t — 7 c 2 sin 2< — c 3 cos 3< — c 4 sin 3i — 2 ^3 sin t 





Chapter 4 

sec. 4.1 
p. 89 


sec. 4.2 

p. 97 


sec. 4.3 


p. 106 


3 (a) P(x) = 1 + 3(x) (2> + (x) (3) . The difference table can be con- 

structed by addition from the leading differences 
P(0) = 1 AP(0) = 0 A 1 2 P(0) = 6 A 3 P(0) = 6 

(b) P(x) = — 2(x) + (x) (2) + 4(x) l3 > -+• (xjw. The difference table can 
be constructed by addition from the leading differences 

P( 0) - 0 AP(0) = -2 A 2 P(0) = 2 A 3 P(0) = 24 

A 4 P(0) = 24 

(c) P(x) = 6 + 2(x) + 13(x) (2 > + 17(x)< 3 > +- 8(x)«> + (x)<«. The dif- 
ference table can be constructed by addition from the leading 
differences 


P( 0) = 6 AP(0) = 2 A 2 P(0) - 26 A 3 P(0) = 102 

A 4 P(0) = 192 A 6 P(0) = 120 

9 (a) F(x) = (x)~< 2 > - 2(x)~ t3) ; (b) F(x) = (*)-<» - 2(x)-(»; 

(c) F(x) = (x)~™ - 4(x)~ (2) + 4(x)-«> 

13 f(x„,x h ~ ' - ^ ! f(Xi) 

n ~ *<) 


3 (a) 1.338; (b) 2.819 6 -** .+ 6x* + * -' 2 

- Xo + X* f(Xo,Xi) 

» Xmax — _ ”” ~Z7, C 

2 2/(xo,Xi,x z ) 


J/m»* 


[/(X(|,Xj) — /(X 0 ,X a ) +/(Xl,X2)] 2 

4/(Xo } Xi,Xa) 


1 f'(200) = 0.00500000        exact value = 0.00500000
  f''(200) = -0.00002499      exact value = -0.00002500
  f'''(200) = 0.00000024      exact value = 0.00000025
  f'(205) = 0.00487806        exact value = 0.00487805
  f''(205) = -0.00002377      exact value = -0.00002380
  f'''(205) = 0.00000025      exact value = 0.00000023

3 $\tfrac{1}{4}(n + 1)^2 n^2$        5 7.486

x      $\int_x^1 \frac{\sin x}{x}\,dx$
0.0    0.946
0.1    0.846
0.2    0.746
0.3    0.647
0.4    0.550
0.5    0.453
0.6    0.356
0.7    0.265
0.8    0.174
0.9    0.086
1.0    0.000


9 


13 


When the first difference correction terms are taken into account the value 
of the integral is 0.31028. When the correction terms through the third 
differences are taken into account the value of the integral is 0.31027. 




Pft 35*/q 
24 + 640 ' 


(a) Co = c s = M; Ci - c 2 - % 

(b) C 0 = C t = WiS’t Cl *= Cl = 0 ii 5 ] Ci = 2 f4 B 


15 




sec. 4.4 

p. 116 


sec. 4.5 
p. 125 


1 SI* - -1.7378; y> = -2.1503 
3 yi = 1,1103; yi = 1.2428; y 3 = 1.3997 

6 1/1 = 0.1050; ya » 0.2198; y 3 = 0.3445 
zi = 0.9998; z 2 = 0.9986; z 3 = 0.9955 

7 Like Milne’s method, the Adams-Bashforth method requires that the
first few values of y be computed by other means, say by the Runge-Kutta
method or the evaluation of a series expansion for the solution. In
particular, in the present problem only the values after y(1.4) can be
obtained by the Adams-Bashforth method.

y(1.1) = 1.0048    y(1.2) = 1.0187    y(1.3) = 1.0408

y(1.4) = 1.0703    y(1.5) = 1.1065    y(1.6) = 1.1488

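9 $y_{n+1} = y_n + h\left(\dfrac{1{,}901}{720}\,y'_n - \dfrac{1{,}387}{360}\,y'_{n-1} + \dfrac{436}{120}\,y'_{n-2} - \dfrac{637}{360}\,y'_{n-3} + \dfrac{251}{720}\,y'_{n-4}\right)$    (open formula)

$y_{n+1} = y_n + h\left(\dfrac{251}{720}\,y'_{n+1} + \dfrac{323}{360}\,y'_n - \dfrac{11}{30}\,y'_{n-1} + \dfrac{53}{360}\,y'_{n-2} - \dfrac{19}{720}\,y'_{n-3}\right)$    (closed formula)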
(closed formula) 


x      y (by numerical solution)    y (from exact solution)
0.0    1.0000                       1.0000
0.1    1.0052                       1.0052
0.2    1.0214                       1.0214
0.3    1.0499                       1.0499
0.4    1.0919                       1.0918
0.5    1.1488                       1.1487
0.6    1.2222                       1.2221
0.7    1.3138                       1.3138
0.8    1.4256                       1.4255
0.9    1.5597                       1.5596
1.0    1.7184                       1.7183


s Vo + { ~ Vo + Vi — 




y'ih* \ 

6 J 





Vi = ~Va + 2yi + y"h 2 

An accompanying closed formula can be obtained by fitting a polynomial
to y'' (which can be obtained from the given differential equation as soon
as an estimate for $y_2$ is available) and any three of the data $y_0$, $y'_0$, $y_1$, $y'_1$.


1 


3 


(a) y = ci( — 3) x + c»(— 4)*; (b) y = (-3)* + c 2 m( — 3)^; 

(c) y = 2 xl2 (A cos %irx + B sin %tx)] (d) y = c t 2 I + e 2 3 x 

(a) y = Ci3 x + c 2 ( — 2) x — — (6* + 1) + — x 3 1 
36 15 


(b) y = A cos — + B sin— + 


sin x + sin ( x — 2) 
2(1+ cos 2) 





sec. 4.6 

p. 142 


I sm (n - x)n 

Vo fi 

1 sm nn 

J n 

f sinh (n - x)n „ 

— — Vo fx 

' smh nil 

/ 1 + » 

\ Bin (« + Dm 

) sin n 

) (~D"(1 +n) 

I ^_ 1 ^„sinh ( n + 1 ) ft 


k < H 
k = H 
k>H 

X = 2 
-2 < X < 2 

X = -2 
X < -2 


sinh fi 2 

IB (a) The only solution is y — 0. 

(b) y - $( x ~ 1) 

(c) Because a 2 = 0, the equation is actually of the first order, and its 
complete solution contains only one arbitrary constant: 


'(.-if 

'-*(-if 


Oo t* 0 


(d) i/ = A ( ) + '!■(£ — I), where 4>(;r) is a particular solution of 

the nonliomogeneous equation (a n E + m)y = <p(x) 


1 (a) $y = (2 + 133x)/102$; (b) $y = (-68 + 187x)/133$

3 (a) x = 1.683, y = -1.847; (b) x = 1.739, y = -1.811

B (c) X ~ \inP nd(x) — VlllPn i(x) 

:c 2 = M(2rP + n)P n0 (x) - ^n a P„,(x) + H ( »* - n)P„ 2 (a:) 

7 (a) A = 1.000, a = 0.499

(b) Using the results of part a as an initial approximation, linearizing
via Taylor’s series yields A = 0.995, a = 0.498.

11 After a has been found, A may be determined by applying the method
of least squares to the equations

\[ y_1 = Ae^{ax_1}, \quad y_2 = Ae^{ax_2}, \quad \ldots, \quad y_n = Ae^{ax_n} \]

in which A is the only unknown. This method is generally to be preferred
to linearizing by taking logarithms, because it does not introduce any
unwarranted weighting of the data. It is clearly preferable to both this
and the use of Taylor’s series on grounds of simplicity.
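The one-unknown least-squares step described in answer 11 has a closed form: minimizing $\sum (y_i - Ae^{ax_i})^2$ with a held fixed gives $A = \sum y_i e^{ax_i} / \sum e^{2ax_i}$. A minimal Python sketch follows; the function name `fit_amplitude` and the sample data are hypothetical and are not taken from the exercise.

```python
import math


def fit_amplitude(xs, ys, a):
    """Least-squares value of A in y = A*exp(a*x) when a is already known."""
    numerator = sum(y * math.exp(a * x) for x, y in zip(xs, ys))
    denominator = sum(math.exp(2 * a * x) for x in xs)
    return numerator / denominator


# Hypothetical data generated from y = 2*exp(0.5*x); the fit recovers A = 2.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
A = fit_amplitude(xs, ys, 0.5)
```

Because the residuals are measured in y itself rather than in ln y, no extra weighting of the data is introduced, which is the point made in the answer.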

13 „ - (s - - “) *• - 0.980 - 0.418*. 

15 p — x cos 0 + if sin 0 


Chapter 5 

sec. 5.3 
p. 163 


5 

13 

17 

19 

21 


The first integer equal to or greater than 


In 2 y / 1 — (c/cc) 2 


C/Cc 


tmax = l/« n ; y„ mx = V(,/o> n e 
y - — c _4 ' 2t (2 cos 14.4< + J{ 2 sin 14.4<) 
y = — g-«(2 cos 8i ! + M sin 8f) + 2 

Period — r ~ 2?r sj a/ixij, where 2a is the ‘distance between the 
the rollers. From this n — ‘iw^a/gr*. 


of 




23 The maxima of the magnification ratio curves of Exercise 22 occur where
$\omega/\omega_n = 1/\sqrt{1 - 2(c/c_c)^2}$, which is always greater than 1 for
$0 < c/c_c < \sqrt{1/2}$, that is, whenever a maximum exists.

26 The amplitude of the steady-state response varies between the values 



sec. 5.4 

p. 170 


1 Across the resistance: B = iR — 40(e -20w — e~ 800i ) 
di 

Across the inductance: E = A-~ => 8(— e -200f + 4e _80D< ) 
at 

Across the capacitance: E — ~ jf idt - 8(3 — 4e~ 200 * + e" 800t ) 


3 

7 


3,12515 
2 6 
3255 _ a 
2 ® 


6 i„ = 0.14 cos (120 tt5 + 19°40') 
9 t - 0.00039 sec 


13 |Z| is a minimum (or 1/|Z| is a maximum) for the undamped natural
frequency $\omega = 1/\sqrt{LC}$. For the magnification ratio the maximum
always occurs at a frequency below the undamped natural frequency.
This involves no contradiction, since the magnification ratio M relates
F and y, whose electrical analogues are E and Q, whereas the impedance
relates E and $i = dQ/dt$.


sec. 5.5 

p. 178 


1 1, 2 3 1, s/5 

6 i, = }>i (cos t + 2 cos 25), = M(cos t — cos 2<) 


7 6 a/6, 12 y/l 


Q 0 


;( 168i ° 


60 \JlC 0114 3 s/LC + Sm 12 y/LG t 

Qo / A . t . t \ 

20 \/LC \ Sm 3 y/LC 12 \ACV 

11 ii = E 0 \{y /--sin — — = — sin — sin — ~=^ 

V A \ 80 yJhC 240 3 y/ LC s/hc) 

' (C /27 . St 1 . i . i \ 

n = Eo \ hr ~ sm — + — sm p= - sm — 7= 

\A\10 LC 30 3 -\/ LC y/hC) 

13 = —%= sin — N - 1, 2, . . . , (n - 1) 

V AC 2n 


i) 


n fk Nr 
15 „„_2 A /-».n 5STT) 

■ /T . iVx 
17 wjv = 2 \ /— sm - — - — 
VI 2n + 1 


A = 1, 2, . . 


A = 1,2, . . . ,« 





19 (a) Q k 


(b) Qt - ( 


E 0 jc sin (71 
to \ L cos }■ 
E 0 fc 


(re - fc + 1 )m 


&(2» + Dm 

fc sinh (re — k + 1)m 
sinh. ^(2n + 1)m 


Chapter 6 


sec. 6.2 

p. 188 


sec. 6.3 
p. 195 


1 a 0 = H 


3 an s 
5 ao = 

a« E 

7 a n = 
9 On = 
11 a„ « 


l/nir 
— l/nir 


0 

2 

0 

' 0 
4tt j 


7r(4n 2 - 1) 


(-1)» +1 4 


re = 2, 4, 6, . .. . 
re = 1, 5, 9, . . . 
ti = 3, 7, 11, . . . 

/ 1/wir 
b n — t 2/reir 

u 

71 = 1, 2, 3, . . . 


f S/nir 

\ 0 


= 1, 3, 
= 2 , 6 , 
= 4, 8, 


= 1, 2, 4, 5, 7, 8, 
■ 3, 6, 9, . . . 


-2/tmt 

-4/nir 


71 5 s5 0 bn — 0 

n = 1, 3, 5, . . . 
re - 2, 6, 10, . . . 

» - 4, 8, 12, . . . 

0 n odd, 7i 

2 

7i even 
re odd, 7i 


t( 1 — w 2 ) 

0 

2 re 

l t( 1 - re 2 ) 

_ 47rre(l — er 1 ) 
n = 1 + 4reV 

. 2rejr 3 ( 

“7 ” j£S*V " 

1 271TT 3 

S 0O5 7 + w“ 

re 3 *- 3 nV 2 


re 3 *- 3 

16 4 

re 3 *- 3 re 2 *- 2 


2rejr 

a ___ 

= 1, 5, 9, . . 
= 2 , 6 , 10 , . 
= 3, 7, 11, . 
= 4, 8, 12, . 





sec. 6.4 

p. 200 


sec. 6.5 
p. 205 


9 Yes. In particular, the Fourier expansion of 


m = { 


£ - £* 

t - £* - 4£ 3 - 2£ 4 + k(t + 1 )H 3 g(t) 


0 S £ g I 
-1 S £ S 0 


where g(t) is any function possessing a continuous second derivative,
will converge to $t - t^3$ for $0 \le t \le 1$ and will have coefficients decreasing
as $1/n^4$.

11 Since $1/(2 + \cos t)$ possesses derivatives of all orders at all points in the
interval $0 \le t \le 2\pi$, the Fourier coefficients of this function will decrease
faster than the reciprocal of any fixed power of n. (See Exercise 29,
Sec. 16.2.)

15 a n will decrease faster than 1/n 1 2 provided that, for all values of n, 


^ [f «i + ) - f(<r)] cos i = 0 

where {£<) is the set of points of discontinuity of f. 

b n will decrease faster than 1/n 5 provided that, for all values of n, 

Tirm -m-)] sm— f = o 

i P 


where (£<) is the set of points of discontinuity of /'. 


i ( 1 ~ cos H n ^ \ 

\ ) 

fc-toa-f -""-1 ^ - -) 

yl — cos Yintrf 


-- 
\ n. 1 r 

7 „ = tan -1 


— ( 1 — cos ^ ) n even 

nir\ 2 J 

n odd 

/l — 2 cos \in-K + cos ?i7r\ 
y sin Yn-K J 


S n = tan" 

■ H 

~ Co = H 


(r 


2 cos Ynir + cos nirj 


2v 1 — 4n 2 


1 The complete solution will originally appear in the form $y = c_1 y_1 + c_2 y_2 + Y$,
where Y is the Fourier series obtained as the answer to Example 2.
Imposing initial conditions of displacement and velocity will thus
lead to a pair of simultaneous linear equations in $c_1$ and $c_2$ in which the
constant terms will involve the infinite series which result from the
evaluation of Y and Y' when t = 0. Although there is no theoretical
problem in determining $c_1$ and $c_2$ from these equations, the arithmetical
complications are obvious.





sec. 6.6 

p. 210 


sec. 6.7 

p. 220 


3 y„ = 0.190 sin (2irt - 3.2°) + 0.142 sin (M - 22.3°) 

+ 0.047 sin (lOirt - 159.8°) + 0.011 sin (14*4 - 171.2°) + ■ • • 
5 y,. = FolH + 0.225 cos (tt t - 3.2°) + 0.171 cos (3ir t - 22.3°) 

+ 0.056 cos (5 vt - 159.8°) 4- • • • 


7 



iE 0 e 

200[600n7r + i(200nV - 1,250)] 


(Note: The term corresponding to n = 0 is to be omitted.) 


1 f(x) = 2.445 - 0.939 cos x - 0.378 cos 2x - 0.230 cos 3x 

- 0.167 cos 4x - 0.133 cos 5x - 0.113 cos 6a; - 
3 T max = 59.46 - 20.99 cos t - 0.42 cos 2 t + 0.17 cos At - 0.02 cos 51 

+ 0.06 cos 7 1 - 0.08 cos 8t + 0.01 cos 104 

- 8.94 sin t - 0.20 sin 2 1 + 0.20 sin 3 1 + 0.43 sin 4 1 

- 0.07 sin 5 1 + 0.02 sin 7 1 + 0.03 sin 9 1 

- 0.05 sin lOt 

T m in = 45.08 - 19.66 cos t - 0.27 cos 2 1 + 0.89 cos 3t + 0.04 cos At 

- 0.06 cos 5 1 — 0.04 cos 7 1 + 0.04 cos St 

- 0.05 cos 9< + 0.02 cos 10< 

- 9.41 sin t + 0.28 sin 2 1 - 0.20 sin 3< + 0.36 sin 4< 

+ 0.18 sin 5 1 - 0.17 sin 6< - 0.25 sin It 

- 0.07 sin 8 1 — 0.03 sin 9t + 0.09 sin lOi 
5 If m is so large that all coefficients after $b_m$ are negligibly small, the formula
of the exercise expresses $b_m$ in terms of selected values of the given
function which either will be known or can easily be found. By repeating
this process for decreasing values of m, using previously computed coefficients
where appropriate, the coefficients down to and including $b_1$ can
be approximated.


1 


3 


(a) 

a n = 

(b) 

a.- 

(c) 


(d) 

On = 

(a) 

/(i) = 

(b) 

m = 

(c) 

m - 

(d) 

m - 

(e) 

m - 

(f) 

m - 


2 1 — cos (j>„ 


- Aeo b n = 0 


i 7“ cos cat 
: JO a 2 + w 2 ' 




1 +a 
a cos cot + u sin u 
a 2 + to 2 

J r« sin irw . 

sin oit act> 

0 1 -co 2 


2 [<* COS Hirco 

= — I cos cot dco 

TT JO 1 - CO 2 


- 1 - r 

jr Jo 

4 /•“ sin w 

= 7T Jo 


sin (o(l — t) + sin at 



(On 


IT 

Aco = — 

V 
tr 

Aco = — 

V 



p 


COS cat dco 




5 

9 

11 


m 


2 [uq 1 — cos u 


- [(* - 1) Si co„ (* - 1) - 22 Si w 0 < + (< + !) Si <o„(* + 1)] 


■u 


1 — cos w ( b — co 2 ) cos to* + aw sin ait 

^ (6 - «*)* + aW 

-7 f M(oj) f f(s) cos [cos — ut + a(w)] ds da 
irk JO J- <* 


where k is the modulus of the spring in the system, $M(\omega)$ is the magnification
ratio, and $\alpha(\omega)$ is the phase angle.


Chapter 7 

sec. 7.1 
p. 232 


sec. 7.2 
p. 236 


sec. 7.3 
p. 241 


sec. 7.4 
p. 253 


6 


No; for instance, the abscissa of convergence of $e^{-t}$ is $\alpha_0 = -1$, whereas the abscissa of
convergence of $\int_0^t e^{-t}\,dt$ is $\alpha_1 = 0$.


1 $\mathcal{L}\{f^{(n)}\} = s^n \mathcal{L}\{f\} - \sum_{j=0}^{n-1} s^{n-1-j} f^{(j)}(0+)$

9 If T(f') and T(f'') are not to involve the evaluation of f or any of its
derivatives, it is necessary that

\[ K(s,a) = K(s,b) = 0 \]

and that

\[ \left.\frac{\partial K(s,t)}{\partial t}\right|_{t=a} = \left.\frac{\partial K(s,t)}{\partial t}\right|_{t=b} = 0 \]

If $\phi(s,t)$ is an arbitrary differentiable function which is bounded at t = a
and at t = b, these conditions are met by any kernel of the form

\[ K(s,t) = (t - a)^2 (t - b)^2 \phi(s,t) \]


s l/l a \ 

1 s 2 -5 2 3 2\s s 2 + 4&y 

6 (a) «-*; (b) Ht 3 ’, (c) M sin 3*; (d) 2 cos 3* + sin 32; 

(e) M(3e«-e-‘) 

7 y = H(~ 2e-“ - 3e"‘ + 5e‘); z = + 2e"‘ - «*) 

9 (a) r(H) = 1.7725; (b) 2; 

(C) H + r($<) + r(«) ■- 2.1291 ; (d) r(c + D/(ln c> +1 


13 


s — 2 

s a s 2 

* 4(s + 3) 


17 cot -1 


(s 2 + 6s + 25) 2 
s + 3 


3 

7 

11 

15 

19 

23 


1 + 2s + 2s 2 

2 e~ 

s 3 

1 + e-'* 


s 2 + l 

cot" 1 ~ + s In 


K* 



4 


(s 2 + 6s + 13) 2 





25 

29 

33 

37 

41 


t - 1 + 
0 1 - 

2 


u(t - 1) 


27 ^ sin 2(t — 2)u(< - 2) 

p-bt _ x>— a< 

31 


1 + e~‘ — 2 cos < 


36 


ft sin < 

Jo ~ d 


^ite~ 2t sin t 39 sin t — t cos t 

f(t) , and /"(<) are each piecewise regular and of exponential order; 

f(t), and are each piecewise regular and of exponential 

order. 


/<“>( 0+) - lim s[s n £ (/} - y s n -i-iyo)(0+)] 

A 

43 H(7e -< + 2fe"< - 3e" 3 ‘) 

46 e~t - e~« + HU - 2e~«-» + - 1) 

47 ai n t 


sec. 7.5 
p. 269 


7 

9 

11 

13 


H(5e-‘ - 18e~ si + 15e- 3 ‘) 

J.<j 6 (3 cos t + 4 sin t — 3e -a — 10fe~“) 
Ho[3e‘ - 5e~‘ + 2e~“(cos t - 2 sin <)] 


= H(2e 2£ - 3e‘ + e~0 + X(3 + e s <‘- 2 > 

* Ml( 1 + 2< + < s )e"‘ - cost- sin f] 

_ t 4. _ if 4. 14 + 4t - t 

* “ 2 + 3 ~ ° + ° 6 


- 3e <-s - e~«-v)u(t - 2) 


§ « - »’ + [| + — ^ ] «"“■“} »« - u 


y - 


< 8 I 4 < 2 | 6 + 2t 

2 + 3 + 3 



« ~ l) 2 
6 


(s +L r) 11 


- coth - 


1 — (1 4* as)e~ a ‘ 


s 2(l _ e -«aa) 

~H[<h(.r, 1,2) - * 6 (f,l,2)] + M[* 6 (t,2,2) - *.(<,2,2)1 

+ Mo[*.(r,0,l,2) - * 9 (<, 0,1,2)] + Hol*T(r, 0,1,2) - *t(<, 0, 1,2)1 
[(-l) n *.(r,l,l) + *.(<,1,1)1 “ t(-l) n *«(r,2,l) + *.(<,2,1)1 

- [(-l)»*i.(r,2,l) + *»(<,2,1)] 
y = —e~ ( + 2e~ u + [(-l) tt * 6 (T,l,2) + *.(<,1,2)] 

- [(-1)’** 6 (t,3,2) + *.(<,3,2)] 


13 Proceeding naturally (though somewhat incautiously) we obtain at once 



and y — <t>t(t,ir) — [( — 1) b * 8 (t,0,1,t) + *a(<,0,l,ir)] 
However, 


\[ \frac{\cos (x + \pi) + \cos x}{2(1 + \cos \pi)} = \frac{0}{0} \]


(I) 




sec. 7.7 

p. 280 


The reason for this is that the term of lowest frequency in the Fourier
expansion of the driving function f(t) duplicates a term in the complementary
function A cos t + B sin t, and it, though not the rest of the
terms, must be handled in a special way when the necessary particular
integrals are obtained. To circumvent this difficulty, it is convenient to
write the second term in $\mathcal{L}\{y\}$ in the form


- T-T— 2 (1 “ «"'• + - • • •) 

1 + s a 

Then, taking inverses, term by term, we find 
y — tt ) — r)eoa t 


1 $\dfrac{\sin 2t - 2t \cos 2t}{16}$
J4{sin 3 1 + 3 1 cos 3f)e~“ 


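3 $\delta(t) - 2e^{-2t}$

7 $Y = \int_0^t \lambda e^{-a\lambda} f(t - \lambda)\,d\lambda = \int_0^t (t - \lambda) e^{-a(t-\lambda)} f(\lambda)\,d\lambda$

11 $A(t) = \dfrac{1 - 2e^{-t} + e^{-2t}}{2}$    $h(t) = e^{-t} - e^{-2t}$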


13 A(t) = Hi X 10~ 3 (2e -2 ' ooot - «-*■« ««) 

Kt) - — 1 Hie~ s,om + Hae " 1,000 " 6 

ii(t) = j* (_i% ie - 2 .ooo\ + i.^ 2e -i.ooox/o )E(t - \)d\ 
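19 (a) $\delta(t - a) * f(t) = \begin{cases} 0 & t < a \\ f(t - a) & 0 \le a \le t \end{cases}$

(b) $u(t - a) * f(t) = \begin{cases} 0 & t < a \\ \int_a^t f(t - \lambda)\,d\lambda & 0 \le a \le t \end{cases}$

(c) $t^m * t^n = \dfrac{m!\,n!}{(m + n + 1)!}\,t^{m+n+1}$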

Chapter 8 

sec. 8.2
p. 293

7 (b) No

sec. 8.3
p. 299


1 $y(x,t) = \tfrac{1}{2}(1 - |x - at|)[u(x - at + 1) - u(x - at - 1)] + \tfrac{1}{2}(1 - |x + at|)[u(x + at + 1) - u(x + at - 1)]$

3 $y(x,t) = \tfrac{1}{2} \cos (x - at)[u(x - at + \pi/2) - u(x - at - \pi/2)] + \tfrac{1}{2} \cos (x + at)[u(x + at + \pi/2) - u(x + at - \pi/2)]$

5 $y(x,t) = \tfrac{1}{2}[(x - at)e^{-(x-at)}u(x - at) + (x - at)e^{x-at}u(-x + at)] + \tfrac{1}{2}[(x + at)e^{-(x+at)}u(x + at) + (x + at)e^{x+at}u(-x - at)]$

7 $\phi(x) = -(1 + \cos x)$,  x < 0

9 $\phi(x) = (1/a) \cos x\,[u(x + \pi) - u(x - \pi)]$

11 If $f'' \equiv 0$, that is, if f is a linear function, then the given equation is
identically satisfied without restriction on $\lambda$. If f is an arbitrary, twice-differentiable,
nonlinear function, substitution into the given equation
yields $(a\lambda^2 + b\lambda + c)f'' = 0$, and this will be satisfied identically if and
only if $\lambda$ is a root of the quadratic equation $a\lambda^2 + b\lambda + c = 0$. According
as $b^2 - 4ac$ is greater than, equal to, or less than zero, this equation
will have two, one, or no real roots, as asserted.

13 $x - at = c_1$,  $x + at = c_2$




sec. 8.4 
p. 309 


15 The two-dimensional wave equation has solutions of the form

\[ u(x,y,t) = f(x - at) + F(x + at) + g(y - at) + G(y + at) \]

and also of the form

\[ u(x,y,t) = f(x + y - \sqrt{2}\,at) + F(x - y - \sqrt{2}\,at) + g(-x + y - \sqrt{2}\,at) + G(-x - y - \sqrt{2}\,at) \]

where f, F, g, and G are arbitrary, twice-differentiable functions.

1 $\int_0^l f(x)\,dx = 0$; $\int_0^l g(x)\,dx = 0$. Physically speaking, the first condition
implies that the integrated initial angular displacement is zero, which
will always be the case if the origin of $\theta$ is suitably chosen. Since the shaft
is of uniform cross section, the second condition implies that

\[ \int_0^l I\,\dot{\theta}(x,0)\,dx = 0 \]

which is precisely the statement that the total angular momentum of the
shaft is initially (and hence permanently) zero. In other words,
$\int_0^l g(x)\,dx = 0$ implies that the vibration being studied is not superposed
on a uniform rotation.

3 (a) Yes; (b) yes; (c) yes; (d) yes; (e) no; (f) yes


0(x,t) ■■ 


t AnS 


nirat 

s __ 


where 


( 0 n 

. { 81 2 

\ nV n 


odd 


Doubling the tension multiplies the frequency by $\sqrt{2}$. Because it is
easier to change the length quickly and accurately than it is to change
the tension.


„ , v V _ . mrx . 

9 y(x,t) = ^ sm ~J~ sm 


nvat 

T 


11 y{x,t) 


”(t: o 


where . 

n ^ 1 

13 u(x,t) = “ + ^ c 

where A„ = 


n even 
n odd 


nirx 

3 ”T 


' 0 n even 

. — 4« 0 /nV 2 n odd 

16 The normal modes of a uniform shaft of length l vibrating torsionally 
with its left end fixed and its right end free are given by the formula 


(2n — l)7ra: 


n = 1, 2, 3, . 


and the corresponding natural frequencies are 4Z/(2 n — l)a. For a 
similar shaft of length 21 vibrating torsionally with both ends fixed, the 
normal modes and natural frequencies are, respectively, 
mvx 4 1 

sin—— and — m — 1, 2, 3, . . . 


778 


ANSWERS TO ODD-NUMBERED EXERCISES 


sec. 7.7 

p. 280 


The reason for this is that the term of lowest frequency in the Fourier 
expansion of the driving function f(t) duplicates a term in the comple- 
mentary function .4 cos t + B sin t, and it, though not the rest of the 
terms, must be handled in a special way when the necessary particular 
integrals are obtained. To circumvent this difficulty, it is convenient to 
write the second term in £{y\ in the form 


-a - 


'* + e~‘ 


•). 


1 + s 2 

Then, taking inverses, term by term, we find 
y = <fo(t, x) — 0i(<,x)cos t 
sin 2 1 — 2 1 cos 2 1 


16 


$00 - 2<r 2i 


5 J4{sin Zt + 3* cos 3t)e~ u 

7 7'= jf‘ Xe- aX /« - X) d\ - jf* ( t - X)e-“«- x >/(X) d\ 
1 - 2e~* + e -2 


11 AC*) ■ 


h(t) = e'* — e 


13 A(t) = Xi X 10- 3 (2e- 2 - 00w - c — 1 * oow/a ) 
h(t) = — 1 KiC -2, ° 001 + Hae- 1 - 0004 ' 6 
**C0 = f Q l (- 1 Kxe~ s,oooX + H 2 e- 1 - oooM W - X) dX 
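19 (a) $\delta(t-a) * f(t) = \begin{cases} 0 & t < a \\ f(t-a) & a \le t \end{cases}$

(b) $u(t-a) * f(t) = \begin{cases} 0 & t < a \\ \int_a^t f(t-\lambda)\,d\lambda & a \le t \end{cases}$

(c) $t^m * t^n = \dfrac{m!\,n!}{(m+n+1)!}\,t^{m+n+1}$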
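
The closed form for the convolution of two powers of t can be checked exactly. The sketch below assumes the usual definition $f * g = \int_0^t f(\lambda)\,g(t-\lambda)\,d\lambda$; the function names are illustrative only. It expands $(t-\lambda)^n$ by the binomial theorem, integrates term by term with rational arithmetic, and compares the resulting coefficient of $t^{m+n+1}$ with $m!\,n!/(m+n+1)!$.

```python
from fractions import Fraction
from math import comb, factorial

def conv_power_coeff(m, n):
    """Coefficient C in t^m * t^n = C t^(m+n+1), from the defining integral.

    (t - s)^n is expanded by the binomial theorem and s^(m+k) is integrated
    from 0 to t term by term, so the arithmetic is exact.
    """
    return sum(Fraction((-1) ** k * comb(n, k), m + k + 1) for k in range(n + 1))

def closed_form(m, n):
    """The closed form m! n! / (m + n + 1)!."""
    return Fraction(factorial(m) * factorial(n), factorial(m + n + 1))

# Compare the two expressions for a small grid of exponents.
agree = all(conv_power_coeff(m, n) == closed_form(m, n)
            for m in range(6) for n in range(6))
print("closed form confirmed for 0 <= m, n < 6:", agree)
```

Because only integer exponents are handled, this is a spot check of the formula rather than a proof.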


Chapter 8 

sec. 8.2 
p. 293 

sec. 8.3 
p. 299 


7 (b) No 

1 $y(x,t) = \tfrac12(1 - |x - at|)[u(x - at + 1) - u(x - at - 1)]$

$\qquad + \tfrac12(1 - |x + at|)[u(x + at + 1) - u(x + at - 1)]$

3 $y(x,t) = \tfrac12\cos(x - at)[u(x - at + \pi/2) - u(x - at - \pi/2)]$

$\qquad + \tfrac12\cos(x + at)[u(x + at + \pi/2) - u(x + at - \pi/2)]$

5 $y(x,t) = \tfrac12[(x - at)e^{-(x - at)}u(x - at) + (x - at)e^{x - at}u(-x + at)]$

$\qquad + \tfrac12[(x + at)e^{-(x + at)}u(x + at) + (x + at)e^{x + at}u(-x - at)]$

7 e(x) = -(1 + cos x)    x < 0 

9 $\phi(x) = (1/a)\cos x\,[u(x + \pi) - u(x - \pi)]$

11 If $f'' \equiv 0$, that is, if f is a linear function, then the given equation is 
identically satisfied without restriction on $\lambda$. If f is an arbitrary, 
twice-differentiable, nonlinear function, substitution into the given equation 
yields $(a\lambda^2 + b\lambda + c)f'' = 0$, and this will be satisfied identically if and 
only if $\lambda$ is a root of the quadratic equation $a\lambda^2 + b\lambda + c = 0$. According 
as $b^2 - 4ac$ is greater than, equal to, or less than zero, this equation 
will have two, one, or no real roots, as asserted. 
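
The exercise statement is not reproduced in this answer key, so the following sketch rests on an assumption: that the given equation has the form $a\,u_{xx} + b\,u_{xy} + c\,u_{yy} = 0$ and that the trial solution is $u = f(y + \lambda x)$, which leads to the factor $(a\lambda^2 + b\lambda + c)f''$ quoted above. Under that assumption the code lists the real values of $\lambda$ and checks, by central differences with the sample choice f = sin, that the equation is satisfied only for those values.

```python
import math

def real_lambdas(a, b, c):
    """Real roots of a*lam**2 + b*lam + c = 0 (a != 0), in increasing order."""
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return []                      # no real roots
    if disc == 0:
        return [-b / (2.0 * a)]        # one repeated root
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])

def pde_residual(a, b, c, lam, x=0.3, y=0.7, h=1e-3):
    """Central-difference value of a*u_xx + b*u_xy + c*u_yy for u = sin(y + lam*x)."""
    def u(px, py):
        return math.sin(py + lam * px)
    u_xx = (u(x + h, y) - 2.0 * u(x, y) + u(x - h, y)) / h**2
    u_yy = (u(x, y + h) - 2.0 * u(x, y) + u(x, y - h)) / h**2
    u_xy = (u(x + h, y + h) - u(x + h, y - h)
            - u(x - h, y + h) + u(x - h, y - h)) / (4.0 * h**2)
    return a * u_xx + b * u_xy + c * u_yy

# Sample coefficients (hypothetical): b^2 - 4ac = 1 > 0, so two real roots.
roots = real_lambdas(1.0, -3.0, 2.0)
print("real lambdas:", roots)
print("residuals at the roots:", [pde_residual(1.0, -3.0, 2.0, lam) for lam in roots])
print("residual at lam = 0.5 :", pde_residual(1.0, -3.0, 2.0, 0.5))
```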

13 $x - at = c_1$, $x + at = c_2$




sec. 8.4 
p. 309 


15 The two-dimensional wave equation has solutions of the form 
$u(x,y,t) = f(x - at) + F(x + at) + g(y - at) + G(y + at)$
and also of the form 

$u(x,y,t) = f(x + y - \sqrt2\,at) + F(x - y - \sqrt2\,at) + g(-x + y - \sqrt2\,at) + G(-x - y - \sqrt2\,at)$

where f, F, g, and G are arbitrary, twice-differentiable functions. 
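
A quick numerical check of the second family of solutions, as a sketch only. It assumes the two-dimensional wave equation is written $u_{tt} = a^2(u_{xx} + u_{yy})$ and uses the sample profile f = sin with an arbitrary value of a; both choices are illustrative. Each second derivative is approximated by a central difference.

```python
import math

A = 1.5  # wave speed a (arbitrary illustrative value)

def u(x, y, t):
    """One term of the second family: f(x + y - sqrt(2) a t) with f = sin."""
    return math.sin(x + y - math.sqrt(2.0) * A * t)

def second_difference(g, h=1e-3):
    """Central-difference approximation to g''(0)."""
    return (g(h) - 2.0 * g(0.0) + g(-h)) / h**2

def wave_residual(x, y, t):
    """u_tt - a^2 (u_xx + u_yy), which should vanish up to discretisation error."""
    u_tt = second_difference(lambda d: u(x, y, t + d))
    u_xx = second_difference(lambda d: u(x + d, y, t))
    u_yy = second_difference(lambda d: u(x, y + d, t))
    return u_tt - A**2 * (u_xx + u_yy)

print("residual:", wave_residual(0.2, 0.4, 0.1))
```

The factor $\sqrt2$ is what balances the two space derivatives: each of $u_{xx}$ and $u_{yy}$ contributes $f''$, while $u_{tt}$ contributes $2a^2 f''$.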

1 $\int_0^l f(x)\,dx = 0$; $\int_0^l g(x)\,dx = 0$. Physically speaking, the first condition 
implies that the integrated initial angular displacement is zero, which 
will always be the case if the origin of $\theta$ is suitably chosen. Since the shaft 
is of uniform cross section, the second condition implies that 

$\int_0^l I\,\theta_t(x,0)\,dx = 0$

which is precisely the statement that the total angular momentum of the 
shaft is initially (and hence permanently) zero. In other words, 
$\int_0^l g(x)\,dx = 0$ implies that the vibration being studied is not superposed 
on a uniform rotation. 

3 (a) Yes; (b) yes; (c) yes; (d) yes; (e) no; (f) yes 

5 $\theta(x,t) = \sum_{n=1}^{\infty} A_n \sin\dfrac{n\pi x}{l}\cos\dfrac{n\pi at}{l}$ where $A_n = \begin{cases} 0 & n \text{ even} \\ \dfrac{8l^2}{n^3\pi^3} & n \text{ odd} \end{cases}$

7 Doubling the tension multiplies the frequency by $\sqrt2$. Because it is 
easier to change the length quickly and accurately than it is to change 
the tension. 


, , V „ . nirx , nirat 

9 y(x,t) - 2, Bn sm ~ Sm ~T 


where 

11 y(x,t) = 
where 


n even 
n odd 


Jr^ 


„ . nirat\ 

B, sm — j 


- _iL i 

nVa 


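13 $u(x,t) = \dfrac{u_0}{2} + \sum_{n=1}^{\infty} A_n e^{-n^2\pi^2 t/a^2 l^2}\cos\dfrac{n\pi x}{l}$

where $A_n = \begin{cases} 0 & n \text{ even} \\ -\dfrac{4u_0}{n^2\pi^2} & n \text{ odd} \end{cases}$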

15 The normal modes of a uniform shaft of length l vibrating torsionally 
with its left end fixed and its right end free are given by the formula 

$\sin\dfrac{(2n-1)\pi x}{2l}$    $n = 1, 2, 3, \ldots$

and the corresponding natural frequencies are $4l/(2n-1)a$. For a 
similar shaft of length 2l vibrating torsionally with both ends fixed, the 
normal modes and natural frequencies are, respectively, 

$\sin\dfrac{m\pi x}{2l}$ and $\dfrac{4l}{ma}$    $m = 1, 2, 3, \ldots$




sec. 8.5 
p. 327 


Obviously, for every value of n, the nth natural frequency of the first 
shaft is the same as the natural frequency of order m = 2n - 1 of the 
second shaft. Moreover the nth normal mode of the first shaft is clearly 
congruent to the portion of the (2n - 1)st normal mode of the second 
shaft which lies between 0 and l. The converse is not true, however; 
for neither the normal modes nor the natural frequencies of even order 
of the shaft of length 2l correspond to possible free motions of the shaft 
of length l. 


3 2.03, z 2 = 4.91, z 3 = 7.98; Bi - 0.73, S 2 

6 u = ^ A n e~ l '* tlatl1 cos x j where 

nil ' ’ 


-0.15, B 3 = 0.06, 
200 sin Zn 


and the z’s are the roots of the equation cot z = az. 


7 xi m 70 + 


2 


where A n - 


z„ + sin z n cos Zn 


and the z’s are the roots of the equation cot z = az. 

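11 In each case the natural frequencies are $\omega_n = z_n^2 a/l^2$, where $z_n$ is the nth 
one of the roots of the indicated equation: 

(a) Hinged-hinged: $\sin z = 0$; $X_n = \sin(z_n x/l)$

(b) Fixed-fixed: $\cos z\cosh z = 1$

$X_n = (\sin z_n - \sinh z_n)\left(\cos\dfrac{z_n x}{l} - \cosh\dfrac{z_n x}{l}\right) - (\cos z_n - \cosh z_n)\left(\sin\dfrac{z_n x}{l} - \sinh\dfrac{z_n x}{l}\right)$

(c) Free-free: $\cos z\cosh z = 1$

$X_n = (\sin z_n - \sinh z_n)\left(\cos\dfrac{z_n x}{l} + \cosh\dfrac{z_n x}{l}\right) - (\cos z_n - \cosh z_n)\left(\sin\dfrac{z_n x}{l} + \sinh\dfrac{z_n x}{l}\right)$

(d) Fixed-hinged: $\tan z = \tanh z$

$X_n = (\sin z_n - \sinh z_n)\left(-\cos\dfrac{z_n x}{l} + \cosh\dfrac{z_n x}{l}\right) + (\cos z_n - \cosh z_n)\left(\sin\dfrac{z_n x}{l} - \sinh\dfrac{z_n x}{l}\right)$

(e) Free-hinged: $\tan z = \tanh z$

$X_n = (\sin z_n + \sinh z_n)\left(\cos\dfrac{z_n x}{l} + \cosh\dfrac{z_n x}{l}\right) - (\cos z_n + \cosh z_n)\left(\sin\dfrac{z_n x}{l} + \sinh\dfrac{z_n x}{l}\right)$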
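
The frequency equations in this answer have no closed-form roots, so the $z_n$ must be found numerically. The sketch below is one way to do it, using plain bisection; the bracketing intervals are chosen by inspection and the helper names are illustrative. The equation $\tan z = \tanh z$ is rewritten as $\sin z\cosh z - \cos z\sinh z = 0$ to avoid the poles of the tangent.

```python
import math

def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def fixed_fixed(z):
    """cos z cosh z - 1, which also serves for the free-free beam."""
    return math.cos(z) * math.cosh(z) - 1.0

def fixed_hinged(z):
    """tan z = tanh z, cleared of denominators; also serves for free-hinged."""
    return math.sin(z) * math.cosh(z) - math.cos(z) * math.sinh(z)

z_fixed_fixed = bisect(fixed_fixed, 4.0, 5.0)     # first positive root
z_fixed_hinged = bisect(fixed_hinged, 3.0, 5.0)   # first positive root
print("first root of cos z cosh z = 1:", round(z_fixed_fixed, 4))
print("first root of tan z = tanh z  :", round(z_fixed_hinged, 4))
```

Higher roots can be bracketed in the same way; for large n they lie close to odd multiples of $\pi/2$ for the first equation and close to $(4n+1)\pi/4$ for the second.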

13 Assuming the beam to be hinged at $x = -l$ and at $x = l$ and to bear a 
mass M at $x = 0$, the frequency equation is 

$\sin z\,[2\cos z\cosh z - kz(\cosh z\sin z - \sinh z\cos z)] = 0$

where $z = \sqrt{\omega/a}\,l$ and k is the ratio of the attached mass to the mass of 
the beam. The factor sin z yields the frequencies for the modes of vibration 
in which the mass remains at rest and each half of the beam behaves 
as a simple hinged-hinged beam of length l. 

sec. 8.6 
p. 336 

15 From Exercise 14, the normal modes are $X_n = \sin(z_n x/l)$, where the z's 
satisfy the equation $\cot z = rz$ and r is the ratio of the moment of inertia 
of the attached disk to the moment of inertia of the entire shaft. From 
this it follows easily that $\int_0^l X_n X_m\,dx = -rl\sin z_n\sin z_m \ne 0$, since 
$\sin z_n\sin z_m = 0$ only if $z_n$ or $z_m = k\pi$, and for these values $\cot z \ne rz$. 
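
The failure of orthogonality can be seen numerically. In the sketch below the inertia ratio r and the length l are hypothetical sample values, and the helper names are illustrative. Two roots of $\cot z = rz$ are located by bisection, the integral of $X_n X_m$ over (0, l) is evaluated by Simpson's rule, and the result is compared with the closed form $-rl\sin z_n\sin z_m$ that direct integration gives when $\cos z = rz\sin z$ is used to simplify.

```python
import math

R = 0.5   # hypothetical inertia ratio r
L = 1.0   # hypothetical shaft length l

def bisect(f, lo, hi, tol=1e-13):
    """Root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def frequency_eq(z):
    """cot z = R z written without the cotangent's poles: cos z - R z sin z."""
    return math.cos(z) - R * z * math.sin(z)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

z1 = bisect(frequency_eq, 0.1, math.pi / 2)          # first root
z2 = bisect(frequency_eq, math.pi, 1.5 * math.pi)    # second root

integral = simpson(lambda x: math.sin(z1 * x / L) * math.sin(z2 * x / L), 0.0, L)
closed_form = -R * L * math.sin(z1) * math.sin(z2)
print("Simpson value:", integral)
print("closed form  :", closed_form)
```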

17 Since $\int_{-\pi}^{\pi}\cos mx\cos nx\,dx = \begin{cases} 0 & m \ne n \\ \pi & m = n \end{cases}$, the system is orthogonal 
on the interval $(-\pi,\pi)$. However, since $\int_{-\pi}^{\pi}\sin x\cos nx\,dx = 0$ for all 
values of n, the system is not complete. 


1 Yes. If the frequency of the impressed force is, say, $\omega_m = m\pi a/l$, then, 
for the term $C_m\sin(m\pi x/l)\sin(m\pi at/l)$ in $\phi(x)\sin(m\pi at/l)$, assume a 
term of the form $D_m[\sin(m\pi x/l)][t\cos(m\pi at/l)]$ in the series of particular 
integrals used in the second method. 




sin to t where 


If $n < c/2\pi a$, the time factor in the corresponding product solution is 
overdamped. If $n > c/2\pi a$, the time factor in the corresponding product 
solution is underdamped. If the string is acted upon by a forcing function 
$\sin\omega t$, the natural assumption $Y(x,t) = A(x)\sin\omega t + B(x)\cos\omega t$ 
leads to a pair of simultaneous differential equations for A(x) and B(x), 
and there is no simple extension of the concepts of magnification ratio 
and phase shift. If the second method illustrated in Example 1 is 
employed and a particular integral of the form 

(X s ‘ n ~T~) s * n s * n t) cos ^ 

is assumed, then for each value of n the corresponding term in the par- 
ticular integral can be constructed using the concepts of magnification 
ratio and phase shift. 

The analysis proceeds very much as in Example 1 except that the normal 
modes of the problem are $\cos(n\pi x/l)$ rather than $\sin(n\pi x/l)$. Hence, 
$\phi(x)$ must be expressed as a half-range cosine expansion, and a similar 
assumption must be made for 8(x). 

~ag T sin (z - S) + sin £ _ sinh (z - g) + sinh $ 

5 [ sinz ~ Sinhz 


pA£ 3 I 


where z - y/co/a * and € = 
u = 1 j “ J™ e -x«i/o»/( s ) cos Xs cos Xa: ds d\ 


- 


- £) j sin 


cosh 2X — cos 2X 


£ sin cot cos — cosh ^2 — ^ X 
sinh ^2 - X - sin cot cos ^2 - X cosh - 
+ cos cot sin ^2 — X 


-t] 




sec. 8.7 
p. 343 


where $\lambda = al\sqrt{\omega/2}$. By applying this formula to each term in the 
Fourier expansion of an arbitrary periodic end condition, the steady-state 
temperature distribution produced by such an end condition can be 
determined. 

16 e(x,t) = J e~ pt sin Xa;[A (X) cos qt -f B(X) sin qt] d\ 

+ E 0 e~ ax cos (at + bx) 

where A(X) ~ ~ J Q ~ Et>e~ a * cos 6s cos Xs ds 

and B(X) ~ ~ j Q J^jB 0 we-«« sin 6s + p * A(r) sin rs dr J sin Xs ds 

17 $u = \sum_m\sum_n E_{mn}\,e^{-[(2m+1)^2 + (2n+1)^2]\pi^2 t/4a^2}\sin\dfrac{(2m+1)\pi x}{2}\sin\dfrac{(2n+1)\pi y}{2}$

where $E_{mn} = 4\int_0^1\!\int_0^1 f(x,y)\sin\dfrac{(2m+1)\pi x}{2}\sin\dfrac{(2n+1)\pi y}{2}\,dx\,dy$

19 $u = \sum_{n=1}^{\infty} A_n\sinh n\pi(1 - y)\,\sin n\pi x$

where $A_n = \dfrac{2}{\sinh n\pi}\int_0^1 f(x)\sin n\pi x\,dx$

21 The problem can be solved by superposing the solutions to the problems 
defined by the following sets of boundary conditions: 

$u(x,0) = f_1(x)$    $u(x,1) = u(0,y) = u(1,y) = 0$

$u(x,1) = f_2(x)$    $u(x,0) = u(0,y) = u(1,y) = 0$

$u(0,y) = g_1(y)$    $u(1,y) = u(x,0) = u(x,1) = 0$

$u(1,y) = g_2(y)$    $u(0,y) = u(x,0) = u(x,1) = 0$

23 (a) u = 100 (as should be obvious) 

(b) u = 100x (as should be obvious) 

400 ,, M . (2 n - 1) %x 

c -(2n-l)ir w/2 g m 

(2 n - l)*r 2 

25 $\omega_{mn} = a\sqrt{m^2 + n^2}/2l$ (m, n = 1, 2, 3, . . .), where l is the length of the 
edge of the membrane. When the membrane is vibrating at a pure frequency 
the nodal curves are defined by the equation 

$A_{mn}\sin\dfrac{m\pi x}{l}\sin\dfrac{n\pi y}{l} + A_{nm}\sin\dfrac{n\pi x}{l}\sin\dfrac{m\pi y}{l} = 0$

where $A_{mn}$ and $A_{nm}$ are arbitrary. If either $A_{mn}$ or $A_{nm}$ is 0, the nodal 
curves are straight lines parallel to the edges of the membrane. In general, 
however, the nodal curves are too complicated to describe explicitly. 


to « - 1 


3 (a) e(x,t) 


e -a*x*l4t 



erf — —7= ) E'(\) dX + ( 1 - 

2Vv A 


1 ut + - 


ako 


erf— ~)2?(0) 

2 VO 

. nir x . nirat 

am sm __ 


(n*ir*a*/7*) — to* l n?r(co* — nVa 2 /£*) 

The second term, whose frequency is different from that of the impressed 
force, is present only because friction has been neglected. Actually, it will 





7 


9 


die away rapidly in any realistic physical system. The response of the 
string to a distributed force $F(x,t) = g(x)\sin\omega t$ or $F(x,t) = \sin(n\pi x/l)\,h(t)$, 
where h(t) is periodic, can be found by first expanding g(x) or h(t), as the 
case may be, in a Fourier series and applying the result of the first part 
of the exercise to each term. 

y = _ 4. JL ( at _ x y u (at - x ) 


. . a smh (sz/a) a 1 .si 

£{0| = -=-= , ■ , , • When x = l this becomes — — • ~ tanh— ; 

EJ s cosh (si /a) E,J s a 

hence, in this case $\theta$ is the Morse dot function of period $4l/a$. 


Chapter 9 

sec. 9.1 
p. 350 


sec. 9.2 
p. 356 


sec. 9.4 
p. 365 


sec. 9.5 
p. 371 


1 (a) x = 0, regular; (b) x = 0, irregular; 

(c) x = 0, irregular; x = 1, regular 

(d) x = 1, regular; x = -1, regular 

3 ~ (^+ * ' J ** <w 

A second series solution of this form cannot be found, since the indicial 
equation has equal roots. 

6 - z 1 ^1 — - + 222|1 . 5 - 23311 -5-9 + ' ' ** < “ 

Vt - V* + 2*217 • 11 “ 2*3 !7 • 11 • 15 + " ) * < °° 

7 The point at infinity is a regular singular point of the given differential 
equation. 

9 If the roots of the indicial equation differ by 1, the two roots lead to the 
same series unless $b_1(r - 1) + c_1 = 0$. Similar results hold if the roots 
differ by an integer greater than 1. For instance, if the roots differ by 2, 
the two roots lead to the same series unless 
| (r - l)(r - 2 ) + b B (r - 1 ) + c„ b^r - 2 ) + Ci I 
| bi(r — 1 ) + Ci bt(r — 2 ) + a j 

3 If $x_1$ were a common zero of $J_\nu(x)$ and $J_{-\nu}(x)$, then, from the result of 
Exercise 2, 

$0 = \dfrac{2}{\pi x_1}\sin\nu\pi$

which is impossible, since $\nu$ is not an integer. 
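
The argument above rests on a Wronskian-type identity for $J_\nu$ and $J_{-\nu}$. Exercise 2 is not reproduced here, so the sketch assumes the identity is the standard one, $J_\nu(x)J'_{-\nu}(x) - J'_\nu(x)J_{-\nu}(x) = -\dfrac{2\sin\nu\pi}{\pi x}$, and checks it numerically. The Bessel functions are summed from their power series and the derivatives are taken from the recurrence $J'_\nu = \tfrac12(J_{\nu-1} - J_{\nu+1})$; the function names are illustrative.

```python
import math

def bessel_j(nu, x, terms=40):
    """Power series for J_nu(x); valid for x > 0 and nu not a negative integer."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k * (x / 2.0) ** (2 * k + nu)
                  / (math.factorial(k) * math.gamma(k + nu + 1.0)))
    return total

def bessel_j_prime(nu, x):
    """Derivative from the recurrence J'_nu = (J_(nu-1) - J_(nu+1)) / 2."""
    return 0.5 * (bessel_j(nu - 1.0, x) - bessel_j(nu + 1.0, x))

def wronskian(nu, x):
    """J_nu(x) J'_(-nu)(x) - J'_nu(x) J_(-nu)(x)."""
    return (bessel_j(nu, x) * bessel_j_prime(-nu, x)
            - bessel_j_prime(nu, x) * bessel_j(-nu, x))

nu, x = 1.0 / 3.0, 1.7    # sample non-integral order and sample point
print("series value:", wronskian(nu, x))
print("closed form :", -2.0 * math.sin(nu * math.pi) / (math.pi * x))
```

Since the right-hand side never vanishes for non-integral $\nu$, the two functions cannot vanish together, which is the point of the answer.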

1 y = s/ x [ Cl , i/c*+2) a; (m+ * ), *^ + C 2 Y i/ (ro+ 2) j 

3 y - x^lciJ 0 (2 Vi) + C 2 F o(2 Vi)l 

8 2 ! = ■s/xe~ x [ciJ3/ i ( l / ix 2 ) + CiJ-Yidix 1 )] __ _ 

9 2/ = cJ 0 (2 'V 3a:) + Cl Yo(2 V**) + c 3 /o(2 \/^) + ^K t (2 V 3x) 

* "“)'*> 

8 -x Jt(2x) + 2x 2 Jz(2x) 




sec. 9.6 
p. 376 


sec. 9.7 
p. 386 


19 (a) $\tfrac13\{x^2[J_0(x)\sin x - J_1(x)\cos x] + xJ_1(x)\sin x\} + c$

(b) $\tfrac13\{x^2[J_1(x)\cos x - J_0(x)\sin x] + 2xJ_1(x)\sin x\} + c$

23 $2\sqrt{x}\,J_1(\sqrt{x}) + c$

25 (a) $xI_1(x) + c$    (b) $x^2I_1(x) - xI_0(x) + \int I_0(x)\,dx + c$

(c) $xI_0(x) - \int I_0(x)\,dx + c$    (d) $x^2I_2(x) + c$


■ 3KM3K) 


J o(X„x) 


/(X)7( 5X) - J(5X)7(X) - 0 


- /i(X„x) 


3/ 0 (3X n )(l + X„*) 


(«* + a *) ?6 

6 (a) 1/X; (b) 0; 

7 e- Q< j* e«tf 0 (M) rfi 

9 (fl) (^jT* 

11 e"*J(f/2) 

17 (2 

V< 


(c) 1/X* 


(b) 


Vs 2 + ** (Vs* + X* + s)« 


(s* - X*)^ 
16 sin i - h/ 0 (f) 


cosh •%/ 2h/wk x 
cosh y/ 2 h/wk a 

the constant thickness of the fin. 

ffi(X \Zfl)J 0 (X -y/x) + Ji(X y/H)K 0 (\ y/x) 
K,(X V W(X Vr) + *i(* VWo(X \/r) 
where X = 2 \/ 2 h/kw. 

25 $\omega_1$ = 2,400 cycles/sec; $\omega_2$ = 5,600 cycles/sec 
27 x - (1 + «<) {ft/jit^Xd + «<) % ] + c 2 /_^[HX(l + orf)**]} 
where c x and ci are determined by the equations 


21 u = Ua + (ti» — Wo) 
where 

23 w = Wo + (u c — wo) 


ciJ-aiHX) + CiJ-nOi*) - xo 
ci/_j^(%X) + — 0 

29 Using the first suggested method, the critical lengths are determi ned by 
the roots of the equation J-iiCHafi*) — 0 where a = y/ ' Ap/EI. Using 
the second suggested method, the critical lengths are determined by the 
roots of the equation 

cos y/s cel + }ie~ atli — 0 where a = 



The first critical lengths in the respective cases are given by the formulas 


31 y =* F o tan 




and 


l = 


2.024 


s Ie 

\A P 


Vx/ 0 (2ffl Vx) l 
aJi(2a y/l) J 





33 

36 

37 

39 


41 




J,i ("*) J - H (?) ~ J,i (?) (t) “ 0 

„ heK a’-^(S-=Yj- 

The natural frequencies are the values of $\omega$ determined by the equation 

$J_0\!\left(\dfrac{\omega b}{a}\right) = 0$

where b is the radius of the drumhead. 
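
To turn the frequency equation into numbers one needs the zeros of $J_0$. The sketch below assumes the equation has the form $J_0(\omega b/a) = 0$, with a the wave-speed parameter of the membrane equation; the values of a and b are hypothetical. $J_0$ is summed from its power series and the first three zeros are bracketed by inspection and refined by bisection.

```python
import math

WAVE_SPEED_A = 1.0   # hypothetical value of a
RADIUS_B = 1.0       # hypothetical radius b of the drumhead

def bessel_j0(x, terms=60):
    """Power series J_0(x) = sum over k of (-1)^k (x/2)^(2k) / (k!)^2."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -((x / 2.0) ** 2) / ((k + 1) ** 2)
    return total

def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# J_0 changes sign on (2, 3), (5, 6) and (8, 9).
zeros = [bisect(bessel_j0, lo, lo + 1.0) for lo in (2.0, 5.0, 8.0)]
frequencies = [z * WAVE_SPEED_A / RADIUS_B for z in zeros]
print("first zeros of J0:", [round(z, 4) for z in zeros])
print("omega values     :", [round(w, 4) for w in frequencies])
```

The zeros are not in the ratio 1 : 2 : 3, which is why the overtones of a drumhead are not harmonics of its fundamental.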

Ji(2 -\/Zal)I 2 (2 \/“^) - 2 V^)/ l (2 \/«ai) = 0 

where a 2 = 4p /EgW. 

u(r,0,z) - II w nm cos nB + B nm sin n0) sinh (b nm s)J n (\ nm r) 
where J n (X nm 5) = 0 and 

2 f* rG n (r)J n (K m r) dr 
b* sinh (X„ m /t)J“ +1 (X„ m 6) 

2 j Q b rHn(r)Jn(b nm r) dr 


A nm = ; 


(7„(r) = - f 2ir j(r,0) cos nB d 
v JO 


B nm 
u(r,t) » 100 


i> 2 sinh (X nm h)/? + i(X„ m i>) 
In r — In r 2 


7/„(r) » - f 2T f(r,0) sin n0 do 
T JO 


In ri — In r 2 

+ £ e~* n * t,a *A n [ y 0 ( X„r j ) / o(X»r ) - Jo(X»n)7o(X»r)] 


where X» is the nth one of the roots of the equation 

7„(Xn)/ o(Xr 2 ) - Jo(Xn)T 0 (Xr a ) - 0 

and 


- r[Y 0 (X„ri)J o(X„r) - /oOwOFoPwOl 100 dr 

~ [7 0 (X„r 2 )/ 1 (X»r s ) - / 0 (X„r 8 )7 1 (X»r 2 )l 2 

- ~ [Fo(Xnn)/i(X„ri) - J o(X„ri)Fi(X n ri)] a 


43 $u(r,z) = \sum_{n=1}^{\infty} A_n J_0(\lambda_n r)\sinh(\lambda_n z)$, where $\lambda_n$ is the nth one of the roots of 
the equation $cJ_0(\lambda b) - \lambda J_1(\lambda b) = 0$ and 

$A_n = \dfrac{2\int_0^b r f(r) J_0(\lambda_n r)\,dr}{b^2\sinh(\lambda_n h)\,[1 + (c/\lambda_n)^2]\,J_0^2(\lambda_n b)}$

and c (instead of h) has been used for the parameter in the radiation law 
to avoid confusion with the height h of the cylinder. 


200 , 200 
= — e + — 


I, <->•(£)“ 


46 u(r,d) 




sec. 9.8 
p. 398 


Chapter 10 

sec. 10.1 

p. 412 


sec. 10.2 
p. 428 


! (b) ^ +2 PM ^_ 3Pr(x) + 2PM 

6 P n w( 0) . ^§1 ( “ (2fc + 1)/2> \ k, n both even or both odd 

2 fc fc! \ (n — k)/ 2 / 

7 u = ^ 04r“ + B /r n+l ) P„(eos 0), where .4 and B are determined by 
the equations 

,46i n + j* fi(6) sin 0 P*(cos 0) do 

Abi n + ~i = J Q /a(0) sin 0 P„(cos 0) dd 

. „ , . f 2n 2 Mn - 2) "I 

9 P nl (z) = oi 1 - — I 

„ , , r 2(« — 1) 2 2 (n - 1)(» - 3) . 1 

HAx) -«.[*- 3I + « J 

The usual definitions are obtained by choosing $a_0$ and $a_1$ so that the coefficient 
of the highest power of x in each case is $2^n$. The orthogonality of 
the H's follows from the fact that the given differential equation can be 
written in the Sturm-Liouville form 


1 (a) 80; (b) 0; (c) 4 

7 $D_n = aD_{n-1} - bcD_{n-2}$; if a = 3, b = 2, c = 1, then $D_n = 2^{n+1} - 1$. 
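
The recurrence can be checked against determinants computed directly. The exercise statement is not reproduced here, so the sketch assumes that $D_n$ is the n-by-n tridiagonal determinant with a on the main diagonal, b on the superdiagonal, and c on the subdiagonal; the helper names are illustrative. Exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)            # a zero column: singular matrix
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result              # a row swap changes the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n):
                m[r][k] -= factor * m[col][k]
    return result

def tridiagonal(n, a, b, c):
    """n-by-n matrix with a on the diagonal, b above it and c below it."""
    return [[a if i == j else b if j == i + 1 else c if j == i - 1 else 0
             for j in range(n)] for i in range(n)]

def d_recurrence(n, a, b, c):
    """D_n from D_n = a D_(n-1) - b c D_(n-2), with D_0 = 1 and D_1 = a."""
    prev, curr = 1, a
    for _ in range(n - 1):
        prev, curr = curr, a * curr - b * c * prev
    return curr

for n in range(1, 8):
    print(n, det(tridiagonal(n, 3, 2, 1)), d_recurrence(n, 3, 2, 1), 2 ** (n + 1) - 1)
```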


9 n («»• - 


15 Yes 


1 The product of the given matrices is 
3 For X = 

For X = 


3 7 
13 3 


1 2 I 

2 —1 1 

the given expressions each yield 

1 6 

I -10 

-10 

16 

1 2 0 


2 -2 

0 

0 3 0 

| the given expressions each yield 

0 0 

0 

0 0 4 


0 0 

2 


The given relation is an identity for all square matrices X. If X is not a 
square matrix the given expressions are meaningless. 

7 ABj, where Bj is the jth column vector of B. 

16 No; the individual submatrices must also be transposed; that is, 


= Bo^H 


17 The sum of the elements in any row, say the ith, is the sum of the proba- 
bilities that the system goes from the ith state to some state. Since this is 
certain, the sum of these probabilities must be 1. 





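19 $M = \dfrac{1}{8}\begin{Vmatrix} 4 & 3 & 1 \\ 3 & 4 & 1 \\ 3 & 3 & 2 \end{Vmatrix}$    $M^2 = \dfrac{1}{8^2}\begin{Vmatrix} 28 & 27 & 9 \\ 27 & 28 & 9 \\ 27 & 27 & 10 \end{Vmatrix}$

$M^3 = \dfrac{1}{8^3}\begin{Vmatrix} 220 & 219 & 73 \\ 219 & 220 & 73 \\ 219 & 219 & 74 \end{Vmatrix}$    $M^4 = \dfrac{1}{8^4}\begin{Vmatrix} 1{,}756 & 1{,}755 & 585 \\ 1{,}755 & 1{,}756 & 585 \\ 1{,}755 & 1{,}755 & 586 \end{Vmatrix}$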
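
The powers listed in this answer are easy to reproduce. The sketch below takes M to be one-eighth of the integer matrix with rows (4, 3, 1), (3, 4, 1), (3, 3, 2), as read from the answer, and forms the powers with exact fractions; the helper names are illustrative.

```python
from fractions import Fraction

# M = (1/8) * [[4, 3, 1], [3, 4, 1], [3, 3, 2]]; every row sums to 1.
M = [[Fraction(v, 8) for v in row] for row in ([4, 3, 1], [3, 4, 1], [3, 3, 2])]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matrix_power(a, n):
    """a multiplied by itself n times (n >= 1)."""
    result = a
    for _ in range(n - 1):
        result = matmul(result, a)
    return result

def scaled_numerators(n):
    """Integer matrix N such that M**n = N / 8**n."""
    return [[int(v * 8 ** n) for v in row] for row in matrix_power(M, n)]

for n in (2, 3, 4):
    print(f"8^{n} * M^{n} =", scaled_numerators(n))
```

Each power again has rows summing to 1, as it must if M is a matrix of transition probabilities.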

sec. 10.3 
p. 436 

3 (a) If A is nonsingular. 

5 The inverse of an upper (lower) triangular matrix is also upper (lower) 
triangular, and the elements on the principal diagonal of the inverse are 
the reciprocals of the corresponding elements in the original matrix. 

II 1 3 1 II 

7 The inverse of the coefficient matrix is —3 —1 5 . Multi- 

II 2-2 2 1| 

plying the matric form of the given equation by this matrix gives 

I 10 II 

x - H 10 . 

I Ml 

9 Hint: Verify that $(\operatorname{adj} A)^{-1} = A/|A|$. Then, in the relation $\operatorname{adj} B = |B|\,B^{-1}$, 
let $B = \operatorname{adj} A$ and use the result of Exercise 8. 

13 \K\ = -kn(ki + kt)(kn + ku) ~ kMk l2 + *«) - k u k M (ki + k a ), and, 

if all the k's are positive, this is obviously a negative quantity. 

L 3 2 5 8 II 

15 A’ -1 =» elasticity matrix = — — 5 16 28 

162EI g 2g 54 || 

162 El 80 ~ 46 12 1| 

K = stiffness matrix = — — —46 44 “16 

L 12 -16 7 1| 

In this problem it is the elasticity matrix which is computed directly 
and the stiffness matrix which is computed as the inverse of the elasticity 
matrix. In the example in the text it was the stiffness matrix which was 
computed directly and the elasticity matrix which was computed as the 
inverse of the stiffness matrix. 

sec. 10.4 
p. 443 

1 No, because if all minors of order p < r are zero, then, by expanding the 
minors of order r in terms of their pth minors, we see that they too must 
vanish, contrary to hypothesis. 

(3 X ^ M, 1,2 
3 (a) Rank = <2 X = 1,2 

( 1 X - H 

, f 3 X 1,6 

(b) Rank - \ 2 x = 16 

( 3 X l, l % 

(c) Rank = \ 2 X = i % 

( 1 X = 1 

7 (a) $T_1$: column 2 - 2 column 1; $T_2$: column 3 - column 2; 
$T_3$: column 1 + 2 column 2; $T_4$: column 1 - column 3; 
$T_5$: column 2 - column 3; $T_6$: - column 2. 

(b) $T_1$: row 1 - row 3; $T_2$: row 3 - row 1; $T_3$: row 3 - row 2; 
$T_4$: row 1 + row 3; $T_5$: row 2 + row 3; $T_6$: - row 3. 

II 0-1 Ml 

P-i = -1 0 2 

II 1 1 —2 [| 

9 A and B are equivalent since each is of rank 3. Hence in particular we 
can take P = B and $Q = A^{-1}$. 






sec. 10.5 
p. 459 


sec. 10.6 

p. 465 


Chapter 11 

sec. 11.1 
p. 476 


5 If A is an (n,n) matrix of rank r, then AX = 0 has n - r linearly independent 
solution vectors, and so does $A^T X = 0$. If A is an (m,n) matrix 
of rank r, then AX = 0 has n - r linearly independent solution vectors, 
but $A^T X = 0$ has m - r linearly independent solution vectors. 

7 $3X_1 + 4X_2 - 3X_3 + X_4 = 0$, and from this each vector can immediately 
be expressed in terms of the other three. 


5 


-4 


1 


1 

-6 


3 


1 


1 

2 

+ a 

-1 

+ 

0 

II 

0 

1 


0 


0 


0 

0 


1 


0 


0 


11 (a) $x_1 = -k$, $x_2 = k$, $x_3 = k$

(b) $x_1 = 17k$, $x_2 = -15k$, $x_3 = 13k$, $x_4 = 20k$

13 No; for instance, consider the matrix $\begin{Vmatrix} 1 & 1 \\ 1 & 2 \\ 0 & 0 \end{Vmatrix}$. Its rows are linearly 
dependent, but its columns are linearly independent. 

17 Any (p,q) matrix of rank r can be written as the product of a (p,r) matrix 
and an (r,q) matrix. 

23 Note that $A^{p+1}X = 0$ implies $A^{p+2}X = A^{p+3}X = \cdots = 0$. Now 
consider the possibility that $c_1X + c_2AX + \cdots + c_{p+1}A^pX = 0$. 
Multiplying on the left by $A^p$ gives 

$c_1A^pX + c_2A^{p+1}X + \cdots + c_{p+1}A^{2p}X = 0$

which implies $c_1 = 0$. Similarly, it follows that $c_2 = c_3 = \cdots = c_{p+1} = 0$, 
which proves the given vectors are linearly independent. 

25 Hint: Use the result of Exercise 23. 


1 X 


3 X 


6 X 


7 X 



1 (a) Indefinite; (b) positive-definite; (c) negative-definite; 

(d) positive-semidefinite 

5 (a) Minimum at (0,-3); (b) critical point at (—2,0,0), which is 

neither a maximum nor a minimum; (c) maximum at (3,2,3); 
(d) minimum at (1,0); ( — 1,0) is a critical point which is neither a 
maximum nor a minimum; (e) minimum at (1,1); (0,0) is a critical 
point which is neither a maximum nor a minimum; (f) minima at 
(Mir + 2 nir, + 2mw), maxima at + 2ri7r, + 2nnr) and at 

(Mtt + 2mr, %ic -f 2 «t) 




sec. 11.2 
p. 491 


sec. 11.3 

p. 503 


v*n 


Vs] 0 




1 (a) Xi = 1, X 2 = 2 (repeated, with a single independent characteristic 

vector); Xi = JJ 1 ||, X% — II 1 1 


(b) \i « 1 (repeated, with two independent characteristic vectors), 

-°|U, 


X 2 = 2; (Xi)i = 3 , (X l ) 2 


(c) X, = 0, X 2 = 1, X 3 = 2; Xx 


(d) Xi = -1, X 2 = 1, X, =4;Xi - 


1 * 2 


2 , X 2 = 
-7 I 


*■ - II 1 
2 


(e) Xi = 1 (repeated, with a single independent characteristic vector) , 


= 6; X\ 


iU. 


(f) Xj = 1 (repeated, with two independent characteristic vectors), 
(X,h - ||l ||, (Xi), - ||o| X ‘ • 

3 (a) Xi = 1 (repeated, with two independent characteristic vectors), 


X 2 - 2; (Xi ), 


2\/2 


II Ml 

0 , (Xi) 2i 


6\/2 




(b) Xi — 1 (repeated, with two independent characteristic vectors), 


X 2 = 2; (Xi)i 


1 1 

'HI, (Xxh = H - 2 , X t = H 
0 3 


9 If the characteristic equation vanishes identically. Example: 


1 2 3 
1 1 1 
1 1 1 


16 All characteristic vectors of the symmetric matrix Jj 
proportional to II II- 


1 (a) P,Q - 

(b) P,Q - 

(c) P,Q = 


1 0 
-3 1 1| 

1 0 II ! 

0 1 II' || 0 
1 0 
-2 1 
H -H 


’ II 0 1 

11 


1 0 

3 1 


3 -1 

■2 1 | 

■II? II 
1 -1 
1 0 


0 




1 -1 
1 0 
0 1 




3 (a) 
(b) 
<c) 

(d) 

(e) 

(f) 

(g) 



10 0 1 
0 10,1 
2 3 1 I 0 


i/\/3 2/Ve 
1 /V 3 -l /\/6 

■i/Vo i/Vl 

-2/Ve 17 V 9 
H 3/Vl2 
-M l/V 12 \\ || y. 


H VQ H H \/s 
i/Ve o -1/V3 
i/Ve -H H V 3 

H -H 
K H -H 
H H H 

11° 1 1 
H i o -l 

i-i o 


H\/s 

-H V21 

H 

1/1 

-H Vs 

-H V21 

H • 

Vi 

0 

H V21 

H 

2/3 





k 


l 





sec. 11.4 
p. 515 


sec. 11.5 

p. 524 


sec. 11.6 
p. 531 


5 Yes 

9 For all values of a and b, the given equations are satisfied, respectively, 
by the following matrices: 

a —b 


(a) Z = 

(c) X - 
15 (a) Z, = 


a 2 — 2a — 3 


b 
a 

- 4a — 5 


(c) X, 
17 (a) 

(d) 


/si 


-4 3 
-2 1 


0 2 
-1 3 1 ] 

60 19 II 

-57 -16 1| 

-12 -27 -9 

3 12 3 


2 — a 
-b 
4 — a 

Z 2 - 
Z 3 = 
z 2 - 
Z 3 - 


(b) Z- o*-4o + 3 




4 — a 


a 

ab 

0 

0 

2 a 

0 

-2 

2b 

3a 


i -II 

-10 9 1 

-6 5 


Z 4 = 
2 -1 


1 :? 


(d) Xt 

- -3 

4 


II 3 

-3 

... ||-3 -6 || 

b 1 9 12 1 

(«) || 

0 - 
3 

II II 6 

35 

35 || 

ii (e) 3 

76 

73 

II II -3 

-35 - 

•32 || 


3 (a) A 2 - /l - 0; (b) 4 2 - / = 0; 

(d) A 2 - 44 +3/ - 0 


B (a) 
(d) 


35 

A 3 - 6 A 2 + 65.4 


(b) 


(c) 


(c) A 2 — 34 + 2/ - 0; 
4 s + 34 


10 


(e) 


3 a-* = 


2e _1 — e~ 2 2 e _1 — 2e~ 8 

| -e- 1 + e~ 2 -e" 1 + 2e~ 8 

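5 By the Cayley-Hamilton theorem, 

$e^A = I\left(1 + \dfrac{1}{2!} + \dfrac{1}{3!} + \dfrac{2}{4!} + \dfrac{3}{5!} + \dfrac{5}{6!} + \dfrac{8}{7!} + \dfrac{13}{8!} + \cdots\right) + A\left(1 + \dfrac{1}{2!} + \dfrac{2}{3!} + \dfrac{3}{4!} + \dfrac{5}{5!} + \dfrac{8}{6!} + \dfrac{13}{7!} + \dfrac{21}{8!} + \cdots\right)$

By Sylvester's identity, 

$e^A = \dfrac{2\sqrt{e}}{\sqrt5}\sinh\dfrac{\sqrt5}{2}\,A + \sqrt{e}\left(\cosh\dfrac{\sqrt5}{2} - \dfrac{1}{\sqrt5}\sinh\dfrac{\sqrt5}{2}\right)I$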
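
The Fibonacci numerators in the series suggest that the matrix of the exercise satisfies $A^2 = A + I$, so that $A^n = F_nA + F_{n-1}I$. The exercise's matrix is not reproduced here; the sketch below uses the hypothetical stand-in A = [[1, 1], [1, 0]], which has that property, and compares three things: a direct Taylor sum for $e^A$, the Fibonacci-series coefficients of I and A, and the closed form obtained from the eigenvalues $(1 \pm \sqrt5)/2$.

```python
import math

A = [[1.0, 1.0], [1.0, 0.0]]    # hypothetical matrix with A^2 = A + I
I2 = [[1.0, 0.0], [0.0, 1.0]]

def matmul(x, y):
    """Product of two 2-by-2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_taylor(terms=30):
    """e^A summed directly from the series of A^n / n!."""
    result = [row[:] for row in I2]
    term = [row[:] for row in I2]
    for n in range(1, terms):
        term = [[v / n for v in row] for row in matmul(term, A)]   # A^n / n!
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def fibonacci_coefficients(terms=30):
    """Coefficients of I and A in e^A, using A^n = F_n A + F_(n-1) I."""
    coef_i, coef_a = 1.0, 1.0          # contributions of the terms I and A
    f_prev, f_curr = 1, 1              # F_1 and F_2
    for n in range(2, terms):
        coef_a += f_curr / math.factorial(n)
        coef_i += f_prev / math.factorial(n)
        f_prev, f_curr = f_curr, f_prev + f_curr
    return coef_i, coef_a

half_root5 = math.sqrt(5.0) / 2.0
alpha = 2.0 * math.sqrt(math.e) / math.sqrt(5.0) * math.sinh(half_root5)
beta = math.sqrt(math.e) * (math.cosh(half_root5) - math.sinh(half_root5) / math.sqrt(5.0))

coef_i, coef_a = fibonacci_coefficients()
direct = exp_taylor()
closed = [[alpha * A[i][j] + beta * I2[i][j] for j in range(2)] for i in range(2)]
print("series coefficients (I, A):", coef_i, coef_a)
print("closed-form coefficients  :", beta, alpha)
```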

9 The given matrices commute if and only if y = 2% — 1. When the 
matrices commute, 

sin A =4 sin 1 cos 4 = —4 -f 1(1 + cos 1) 

sin B = B sin 1 cos B — I cos 1 





Chapter 12 

sec. 12.1 
p. 543 


sec. 12.2 
p. 549 


sec. 12.3 
p. 558 


sec. 12.4 
p. 570 


. n , rn , , , r,s sin 1 + sin 2 r 2 8UX 1 - tan 2 
o 3 

sin {A — B) — (A — B) sin 1 

cos (A + B) = I cos 1 

cos C A - B) = {A — J3)(cos 1 - 1) + J 

and, using these, the given identities can easily be verified. 


1 (a) 3, 9, 13, -18, lOi +;i8j + 16k, -% 3 , 131°49', -90, 

2451 + 210j - 170k, 220, 0 

(b) 7, 15, 11, 80, 72i + 24j - 12k, - 6 Hs, 40°22', -636, 

— 210i + 710j + 425k, -1,272, 0 

(c) 15, 15, 9, -40, -75i + 60j + 30k, «%, ey S) 96°7', 915, 

610i + lOOj - 1,420k, 1,830, 0 

7 Not necessarily; not necessarily; not necessarily; yes 

9 $(17\mathbf{i} - 13\mathbf{j} + 8\mathbf{k})/\sqrt{522}$

11 No. In fact, if A = 0 and B = C ≠ 0, then A × B = B × C = 
C × A = 0, but A + B + C = 2C ≠ 0. 

nn -17A + 14B+3C -11U + 319V + 143W 

23 l + 2j + 3k = — 


_ 40i + 45j - 100k 


. 21 + 27 j + 28k 


24i — 6j + 6k 


25 No, because F X R is opposite to the direction in which F would cause a 
right-hand screw to advance. 

3 (a) 

at at 3 

_ dU /dU d«U\ (dJJ d 3 U\ 

(b > * X U X ^) +UX U 1 **) 

7 Yes; the tangents at t = 0 and t = 1 are parallel. Yes; for all values of t 
the tangents at t and at —t are parallel. 

9 TO-(| + 0i+(l+^)j + (5-i+2)k 


Vu 

1 3 

3 (a) yz + 3a; 1 + 2 xz — y 3 ) 
5 ±M(2i - 2j - k) 


-Hi - 8j + 9k 
\/266 


s / 19 


(b) —2yzi + ( xy - z 3 ) j + (6 xy - xz)k 
16 n = —3 


3 *m 6 / 4 5 (a) 0; 

7 3 9 xa e / 24 

11 The common value of the integrals is %. 


(b) ~ S A 5 




sec. 12.5 
p. 583 


sec. 12.6 
p. 593 


Chapter 13 

sec. 13.2 
p. 604 


13 (k/ 2)(*i* + - Cr 0 2 + y 0 2 ) ln+ttli 

17 Yes, provided the entire boundary is consistently traversed in the posi- 
tive sense. 

i. [f(*!V-?l a J) d , dy 

JjR\dxdy dydxj 

1 (a) Hi 0») 'Ml (C) 'Ml (d) 'M 

3 (a) 1; (b) 6; (c) $\pi/2$; (d) $\pi/3$

6 ///4“(S + S + S)-(S+S+S)]" 

, /«Fi a^A o , /sf 2 «f*\ 1 

\dz dx ) \dx dy J J 

9 The length of the curve. No, because the vector T is defined only on the 
curve C. 

16 0 17 f c }ir*dR 

19 The common value of the integrals is 5ira*/4. 

21 The common value of the integrals is 
23 $-2\pi$

29 If O is a point at which the surface S has a tangent plane, the integral in 
Gauss' theorem is equal to $2\pi$. If O is a singular point on S, the integral 
may have any value between 0 and $4\pi$. 


1 — r 8 /3; —In r 3 (\/ a* + z 1 — z) 

a 2 



the summations extending only over the odd positive integers 


(a) 


i - 2j + k ■. 

-i - j + 2k 


2i + 5j — 4k 

- 

e B 

3 

3 

63 = 

3 


§1 = 

21 + k e 2 = 

1 + 2j + 3k 

e 3 = 

1 + j + k 

(b) 

s t = 

21 — j — k e 2 = 

31 - 2 j - k 

e a = 

— 7i + 5j + 3k 

(b) 

gl * 

1 + 2j - k e 2 => 
U T Q~ l V 

2i + j + 3k 

6 3 = 

1+ j+ k 

cos 0 







sec. 13.3 

p. 618 


sec. 13.4 

p. 624 

sec. 13.5 

p. 628 


1 


7 


9 


(a) fix i) Axi + f(x t ) Ax 2 + /(x 3 ) Ax a 

(b) anXiXi + anXiXi + anXiXa (c) 
+ 0212:22:1 + 0222:22:2 + 023X2X3 

+ 031X3X1 + 032X3X2 + 033X3X3 


auxiyi + anxiy?, + auXiy* 
+ auxiyi + 022X21/2 + 023X22/3 
+ 031X32/1 + 0 32X 32/2 + 0 33X 32/3 


(d) 

(e) 

(f) 


(g) 


(a) 

0» 


OuXiXi + 022X2X2 + 033X3X3 

(aix 1 + o 2 x s + o 3 x 3 ) 2 — (oix 1 ) 2 + (a 2 x 2 ) 2 + (o 3 x 3 ) 3 

+ 2aia2X I x 2 + 2aia 3 x 1 x 3 + 2a 2 a 3 x 2 x 3 
fix 1 dy 1 dx 1 dy 2 dx 1 dy 3 

— 2, -1- Z\ -I — z, 

dy' dx k dy 2 dx k dy 3 dx k 

dx 2 dy 1 t dx 2 dy 2 3x 2 dy 3 

dx* dy* dx 3 dy 2 dx 3 dy 3 

dy 1 dx k 3 dy 1 dx k 3 dy 3 dx k 23 

— z i 4. dx * dy * z i _l ^ 2 i 
di/ 1 fix 1 dy 2 dx 1 3j/ 3 dx 1 

dy'dx 2 ~ dy 2 dx 2 dy 3 dx 2 

4. ££! Z 3 ■ dx* dy 2 ^ , 6x* dy 3 ^ 

+ dy 1 dx 3 2 + 3j/ 2 9x 3 dy 3 dx 3 2 

At each of the given points the lengths of the base vectors are 
$e_1 = 1$, $e_2 = 2$, $e_3 = 1$. 

At each of the given points the lengths of the reciprocal base vectors 
are $e^1 = 1$, $e^2 = \tfrac12$, $e^3 = 1$. 


At (2,0,1), V - -ej + V 2 V% e2. At (2, |> 1^, V = e, + H 


The length of V is 2. 


6 (a) No. (b) No. 


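3 (a) For a contravariant vector $V = \xi^i e_i$, the divergence is 

$\dfrac{1}{r^2\sin\theta}\left[\dfrac{\partial(r^2\sin\theta\,\xi^1)}{\partial r} + \dfrac{\partial(r^2\sin\theta\,\xi^2)}{\partial\theta} + \dfrac{\partial(r^2\sin\theta\,\xi^3)}{\partial\phi}\right]$

If we let $V^1$, $V^2$, $V^3$ be the components of V along unit vectors in 
the directions of $e_1$, $e_2$, and $e_3$, respectively, so that $V^1 = \xi^1$, 
$V^2 = r\xi^2$, $V^3 = r\sin\theta\,\xi^3$, the divergence appears in the more usual 
form 

$\dfrac{1}{r^2}\dfrac{\partial(r^2V^1)}{\partial r} + \dfrac{1}{r\sin\theta}\dfrac{\partial(\sin\theta\,V^2)}{\partial\theta} + \dfrac{1}{r\sin\theta}\dfrac{\partial V^3}{\partial\phi}$

(b) For a covariant vector $V = \xi_i e^i$, the divergence is 

$\dfrac{1}{r^2\sin\theta}\left[\dfrac{\partial(r^2\sin\theta\,\xi_1)}{\partial r} + \dfrac{\partial(\sin\theta\,\xi_2)}{\partial\theta} + \dfrac{\partial[(1/\sin\theta)\,\xi_3]}{\partial\phi}\right]$

If we let $V_1$, $V_2$, $V_3$ be the components of V along unit vectors in the 
directions of $e^1$, $e^2$, and $e^3$, respectively, so that 

$V_1 = \xi_1$, $V_2 = \xi_2/r$, $V_3 = \xi_3/(r\sin\theta)$

the divergence appears in the more usual form 

$\dfrac{1}{r^2}\dfrac{\partial(r^2V_1)}{\partial r} + \dfrac{1}{r\sin\theta}\dfrac{\partial(\sin\theta\,V_2)}{\partial\theta} + \dfrac{1}{r\sin\theta}\dfrac{\partial V_3}{\partial\phi}$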




Chapter 14 

sec. 14.2 
p. 635 


sec. 14.3 
p. 641 


sec. 14.4 
p. 644 


sec. 14.5 
p. 649 




-19 - 22 ? 

H (~ 2 -|- 2 i) 
x = 1, y = 2; x = 


7 7 ? 

13 1 + ?; no 


4, y = }{ 


1 

3 cos 


(a) Rotation through —90°; (b) rotation through 45° 

r tt _ -y/2 + ? V 2 
4 ‘ * ^ 4 2 

. 3*- _ - V2 + i \ X 2 


Oir 


5ir 


+ *' s; 
- + ?s 


v/2 - i \Z% 
2 

\/2 — i -y/i 
2 

= V* + i 
_ - yii + % 


9r 


9ir 


$2^{1/6}(\cos 15° + i\sin 15°) = 1.084 + 0.291i$
$2^{1/6}(\cos 135° + i\sin 135°) = -0.794 + 0.794i$
$2^{1/6}(\cos 255° + i\sin 255°) = -0.291 - 1.084i$

$2^{2/5}(\cos 180° + i\sin 180°) = -1.320$
$2^{2/5}(\cos 108° + i\sin 108°) = -0.408 + 1.255i$
$2^{2/5}(\cos 36° + i\sin 36°) = 1.068 + 0.776i$
$2^{2/5}(\cos 324° + i\sin 324°) = 1.068 - 0.776i$
$2^{2/5}(\cos 252° + i\sin 252°) = -0.408 - 1.255i$
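
The moduli and angles of the two lists of roots above are consistent with the cube roots of 1 + i and the fifth roots of -4. The exercise statements are not reproduced here, so that identification is an inference. The sketch below computes the roots of those two numbers from De Moivre's theorem with the standard `cmath` module and can be compared with the tabulated decimals; `nth_roots` is an illustrative helper.

```python
import cmath
import math

def nth_roots(w, n):
    """All n values of w**(1/n): modulus r^(1/n), arguments (theta + 2 k pi)/n."""
    modulus, angle = cmath.polar(w)
    return [cmath.rect(modulus ** (1.0 / n), (angle + 2.0 * math.pi * k) / n)
            for k in range(n)]

cube_roots = nth_roots(1 + 1j, 3)      # arguments 15, 135, 255 degrees
fifth_roots = nth_roots(-4 + 0j, 5)    # arguments 36, 108, 180, 252, 324 degrees

for root in cube_roots + fifth_roots:
    degrees = math.degrees(cmath.phase(root)) % 360.0
    print(f"{degrees:6.1f} deg   {root.real:+.3f} {root.imag:+.3f}i")
```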


13 At the point 


ni\Zi +■ ?n 2 zi + ?n 3 « 3 


1 No 

3 If and only if $z_1$ and $z_2$ have the same argument (or arguments differing 
by an integral multiple of $2\pi$) 

5 If and only if $y = \pm x$
7 The y-axis; the point (1,0); there is no locus. 

9 The set of all points on and within the parabola $y^2 = 2x - 1$


-2 - 3? 


3 — ?j 3 2 + 2s - 


(a) Unbounded, open, simply connected 

(b) Bounded, closed, multiply connected 

(c) Unbounded, open, multiply connected 

(d) Unbounded, closed, simply connected 

(e) Unbounded, neither open nor closed, simply connected 

(f) Bounded, closed, simply connected 

Along the parabolic paths $y = mx^2$ the function approaches the respective 
limits $\dfrac{m}{1 + m^2}$; hence, $\lim_{z \to 0}\dfrac{x^2y}{x^4 + y^2}$ does not exist. 





sec. 14.6 
p. 656 


sec. 14.7 
p. 663 


sec. 14.8 
p. 674 


1 At $z = \pm 1, \pm i$

3 Only at the origin; only at the origin; nowhere 

5 If and only if u and v are constant 

7 The values all lie on the circle $x = \dfrac{1 - m^2}{1 + m^2}$, $y = \dfrac{2m}{1 + m^2}$, i.e., the 
circle $x^2 + y^2 = 1$. 

7 $(1 + i)^{1-i} = e^{(\ln\sqrt2 + \pi/4 + 2n\pi) + i(-\ln\sqrt2 + \pi/4 + 2n\pi)}$, and the arguments 
of the different values differ only by multiples of $2\pi$. 
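
The formula for the values of $(1 + i)^{1-i}$ can be confirmed with the standard `cmath` module. In the sketch below, n labels the branch $\log(1 + i) = \ln\sqrt2 + i(\pi/4 + 2n\pi)$; the function names are illustrative. The power is computed as $e^{(1-i)\log(1+i)}$ and compared with the closed form, and the arguments for different n are compared with each other.

```python
import cmath
import math

LN_ROOT2 = math.log(math.sqrt(2.0))

def power_value(n):
    """(1 + i)^(1 - i) on the branch log(1 + i) = ln sqrt(2) + i(pi/4 + 2 n pi)."""
    log_branch = complex(LN_ROOT2, math.pi / 4 + 2 * math.pi * n)
    return cmath.exp((1 - 1j) * log_branch)

def closed_form(n):
    """exp of (ln sqrt2 + pi/4 + 2 n pi) + i(-ln sqrt2 + pi/4 + 2 n pi)."""
    real = LN_ROOT2 + math.pi / 4 + 2 * math.pi * n
    imag = -LN_ROOT2 + math.pi / 4 + 2 * math.pi * n
    return cmath.exp(complex(real, imag))

for n in (-1, 0, 1):
    value = power_value(n)
    print(n, abs(value), cmath.phase(value))
```

The moduli differ from branch to branch by factors of $e^{2\pi}$, while the principal arguments printed in the last column coincide.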

9 1 

13 Yes 

15 $z = \dfrac{\pi}{2} + 2n\pi \pm i\cosh^{-1} 3$

17 $z = \ln 2 + (2n + 1)\pi i$

19 Since the complex numbers are not ordered, the inequalities appearing 
in Rolle’s theorem are meaningless for complex variables. 

23 2 

1 Along each path the integral is equal to 6 + 26i/3. 

3 Along each path the integral is equal to li (— 1 + 5i). 

6 On the path |z| = 3 the absolute value of the integral is dominated by 
i^e 8 . (This is a very crude estimate, since by the methods of Chap. 16 
the exact value of the integral can be shown to be sin 2.) On the path 
|z| = }i the integral, by Cauchy’s theorem, is 0. 

7 (a) 0; (b) H(3 + 2*>; (c) K(-3 + 2t> 

9 (a) -3w/2; (b) 3iV/2; (c) 0 


Chapter 15 

sec. 15.1 

p. 685 


sec. 15.2 
p. 691 


1 

3 

9 


The interior of the circle of radius 1 and center i 
The parabola y 2 = — 1 ~ 2x and its interior 

Yes; for instance, for all values of x, the series of continuous fractions 


x 

1 + X 2 


r * 1 

, [ 2s 

3x 1 

[l+z* 1 + (2a:) 2 J 

[l + (2*)* 

1 + (3a;) 2 J 


+ • • • 


converges to the continuous function 0, although in the neighborhood of 
the origin the series does not converge uniformly. In fact, |J? B (*)| = 

- — ~~ — - < < implies that n must satisfy a requirement of the form 
1 1 + (m ) 2 1 
n>f(e)/\x\. 


1 (a) /(z) = -1 + 2z - 2z* + 2z 3 - 


1*1 < 1 

(z - 1)* , 


|z — 1| < 2 

(a) /(z) - | - Kz* + 7 A* 3 - + • • • W < 1 

w M - 0 - 0 - (p - 5 ) - 2) + (? - 5>) <■ - 2)! 

7 2 a/2 


6 V2 


I* “ 2| < 3 



sec. 15.3 

p. 697 


Chapter 16

sec. 16.1 
p. 703 


sec. 16.2 

p. 709 




1 (a) Rz) -H + %z + Mz 2 + 'Hzz* + ■ ■ • 

J. 1 _ 1 1 z z 2 z 3 

7~ z 2 ~~z~ 2 ~ 4 ~ 8 ~ 16 ‘ 


(b) /(z) = 


(c) Rz) = • • • + ^f +-,+~+\ 
z 6 z l z s z 2 


(A) Rz) - 


- 1 ~ (z - 1) - (z - ip - . 


(z - 1)< (z - 1)» (z - l) 2 


(E) Rz) - 
3 /(z) - - - 

Rz) - • • 


1 


1 


(z — 2) 1 (z — 2) 3 (z — 2) a 
- 2t + 3(z - i) + 4 i{z - i) 2 - 6(z - i) 3 - • • ■ 

0 < |z - i\ < 1 
| z - i\> l 


2i 


(z - z) r - (z - i) 4 (z - <)* 

5 (a) 0; (b) $2\pi i$; (c) 0; (d) 0; (e) $2\pi i$; (f) $\pi/6$

7 The argument is invalid because there is no value of z for which both 
series converge; that is, there is no value of z for which the two series are 
simultaneously valid. 


1 /"2x 

9 (c) a» = - ~ Jo cos ' 1 ^ cos ^ cos 

(d) a„ = ~ j** cos (2 cos 0) 


nO do 
cos nO dO 


1 (a) (b) M 

3 At z = — 1 + 2i the residue is 34(2 + i); at z = — 1 — 2i the residue is 
34(2 - i). 

6-1 7 Ko 

9 (a) 2 «; (b) 0; (c) 0; (d) 0; (e) 0; 

(f) H(5+2V~5)ir 


1 

6 

9 

13 

17 

21 

23 



ir/3 


?r ( z' hm r"\ 
a 2 - 6 2 \ 6 a / 


{ (62 + c 2)(«-i)/2 sin (a 


* 


. .1+2 cos - j 

3 sm ar \ 3 / 


s 2?ra> \ 


[“ 


3 2T 


\/ 2 a 3 
11 $\pi/4a^3$

15 $\pi(1 + am)e^{-am}/4a^3$
19 ^ e -WV2 s i n _HL 

V2 

_!) tan-f^]} 






sec. 16.3 
p. 716 


sec. 16.4 

p. 728 


Chapter 17 

sec. 17.1 
p. 732 


sec. 17.2 
p. 737 


sec. 17.3
p. 747 


(— i) n 2 T; (i ,A> 

a n * — 7 - = tan ( - esc" 1 - ) 

vV-&« L V 2 vJ 

As $n \to \infty$, $a_n$ approaches zero more rapidly than the reciprocal of any
fixed power of n. This is, of course, implied by Theorem 3, Sec. 6.3,
since all derivatives of the given function are everywhere continuous.


1 M(e~‘ - «-*) 

5 $1 - \cos t$

9 $e^{-2t}(2 - t) - 2e^{-3t}$


3 $\tfrac{1}{2} \sin 2t$
7 $(t \sin 2t)/4$


11 


m) 


To [ 8Z V sin [(2n -j- l)irx/2l] cos [(2n + Y)irat/2l] 
(2n + l) 2 


13 

16 



where $\lambda_j$ is the jth root of the equation $J_0(\lambda) = 0$.


1 $D_1 = 0$, $D_2 = -9$, $D_3 = -81$; therefore, there is at least one root with
nonnegative real part. (Actually the roots are −1.92, 0.86 ± 1.94i.)

3 $D_1 = 2$, $D_2 = 10$, $D_3 = 0$, $D_4 = 0$; therefore, there is at least one root
with nonnegative real part. (Actually the roots are $\pm i\sqrt{2}$, $-1 \pm 2i$.)
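The determinants quoted in answer 3 can be reproduced with a short Python sketch. The polynomial is not given in this answer section; the one used below, $s^4 + 2s^3 + 7s^2 + 4s + 10 = (s^2 + 2)(s^2 + 2s + 5)$, is an assumption reconstructed from the stated roots. The helper names and the particular row layout of the Hurwitz matrix are also illustrative choices.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if not m:
        return 1
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def hurwitz_minors(coeffs):
    """Leading principal minors D_1..D_n of the Hurwitz matrix of
    a0*s**n + a1*s**(n-1) + ... + an, where coeffs = [a0, a1, ..., an]."""
    n = len(coeffs) - 1

    def a(k):
        return coeffs[k] if 0 <= k <= n else 0

    # Entry (i, j), 0-indexed, is a_{2j - i + 1}; out-of-range coefficients are 0.
    h = [[a(2 * j - i + 1) for j in range(n)] for i in range(n)]
    return [det([row[:k] for row in h[:k]]) for k in range(1, n + 1)]


# Polynomial assumed from the stated roots: (s^2 + 2)(s^2 + 2s + 5)
print(hurwitz_minors([1, 2, 7, 4, 10]))   # prints [2, 10, 0, 0]
```

The vanishing of $D_3$ and $D_4$ reflects the pair of purely imaginary roots $\pm i\sqrt{2}$.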


3 The transformation $w = \overline{f(z)}$ is equivalent to the transformation $w = f(z)$ followed by a reflection in the real axis.

5 Angles are not preserved.

7 The equations of the transformation are
$u = x^4 - 6x^2y^2 + y^4$    $v = 4x^3y - 4xy^3$
The image of x = 1 is

$v^4 - 224uv^2 - 256u^3 - 2{,}176v^2 - 1{,}792u^2 - 2{,}048u + 4{,}096 = 0$
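The image equation can be checked directly. The equations for u and v above are the real and imaginary parts of $z^4$, so the sketch below maps points $z = 1 + iy$ on the line x = 1 and substitutes them into the quartic. Integer values of y are used so that the arithmetic is exact; the function names are illustrative.

```python
def image_point(y):
    """Image (u, v) of z = 1 + i*y under w = z**4, in exact integer arithmetic."""
    u = 1 - 6 * y**2 + y**4      # x^4 - 6 x^2 y^2 + y^4 at x = 1
    v = 4 * y - 4 * y**3         # 4 x^3 y - 4 x y^3 at x = 1
    return u, v


def curve(u, v):
    """Left-hand side of the image equation of the line x = 1."""
    return (v**4 - 224 * u * v**2 - 256 * u**3 - 2176 * v**2
            - 1792 * u**2 - 2048 * u + 4096)


for y in range(-3, 4):
    u, v = image_point(y)
    print(y, (u, v), curve(u, v))   # the last column is 0 for every y
```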
9 The equations of the transformation are 

1 z* + y 2 ° x* + y* 

1 (a) $\pi/a$; (b) $\pi/\sqrt{2}$

3 (a) $z = \pm 1$; (b) $|z^2 - 1| = \tfrac{1}{2}$;

(c) $x^2 - 2xy - y^2 = 1$; (d) $x^2 - y^2 = 1$

5 The images of the perpendicular lines x = 1 and y = 0 intersect at an
angle of 45°.

1 -1 

3 If $(d - a)^2 + 4bc = 0$, the transformation leaves a single point invariant.
At least one point must be left invariant by any bilinear transformation.
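The condition in answer 3 can be traced to the fixed-point equation. The short derivation below assumes the bilinear transformation is written as $w = (az + b)/(cz + d)$ with $ad - bc \neq 0$, which is the usual labelling and is taken here as an assumption about the exercise.

```latex
\[
z = \frac{az + b}{cz + d}
\;\Longrightarrow\;
cz^2 + (d - a)z - b = 0
\;\Longrightarrow\;
z = \frac{(a - d) \pm \sqrt{(d - a)^2 + 4bc}}{2c},
\qquad c \neq 0 .
\]
```

When $(d - a)^2 + 4bc = 0$ the two roots coincide, so only one point is left invariant. When $c = 0$ the quadratic degenerates, and the point at infinity is itself invariant, which is consistent with the statement that every bilinear transformation leaves at least one point fixed.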





sec. 17.4 

p. 753 


5 

7 


17 


iz - 2 

- — — ; 3 (u v - + u 2 ) + 8 u + 2b+3-0 

2 + 2 

$w = \dfrac{az + b}{cz + d}$, where a, b, c, d are all real and $ad - bc < 0$



3 3 + i 


3 xhj — y 3 
a; 3 — 3.ri/ 2 



1 $w = -z^2$    3 $w = \sqrt{z}$    5 $w = \sqrt{z^2 - 1}$

7 In the half plane onto which the given region is mapped by the mapping 
function $w = \sqrt{z^2 - 1} + \cosh^{-1} z$, the temperature is



This cannot be explicitly transformed back into an expression for the 
temperature in the original region in the w-plane. 

9 $w = i\pi + \tfrac{1}{2} \ln \dfrac{z + 1}{z - 1}$


1 $r_1 = 3.325$; $r_2, r_3 = 1.338 \pm 0.632i$

3 $r_1 = 3.732$; $r_2 = -2.618$; $r_3 = -0.382$; $r_4 = 0.268$

$r_1 = 1.107$; $r_2 = -0.838$; $r_3 = r_4 = 0.500$; $r_5 = -0.270$

$r_1 = 4.966$; $r_2 = 2.450$; $r_3, r_4 = 0.612 \pm 1.129i$; $r_5 = -0.640$


Appendix 

p. 764 


1 






Index 


The letter e. after a page number refers to an exercise, the letter n. to a footnote. 


Abel’s identity, 32 
Abscissa of convergence, 226 
Absolute convergence, of infinite series, 677 
of Laplace transforms, 229 
of series of matrices, 527 
Absolute value, of complex number, 641 
of vector, 418, 532 
Acceleration, normal, 548 
tangential, 548 
vector, 548 

Acceleration smoothing, 135, 142e. 
Addition, of complex numbers, 633 
of determinants, 408 
of infinite series, 678 
of matrices, 418 
of vectors, 533 
Adjoint matrix, 430 
Admittance, 169 
indicial, 273 
Admittance matrix, 417 
Advancing differences, 82 
Ampere’s law, 589 
Amplitude, of complex number, 637 
of forced vibrations, 155-157, 329 
of free vibrations, 306, 307 
of nth harmonic, 197 
Amplitude envelope, 153, 162
Amplitude modulation, 162, 165e. 
Amplitude spectrum, 213, 214 
Analytic functions, 653 
derivatives of, 653
line integrals of, 667, 670 
mapping by, 732 
poles of, 699 
principal part at, 699 
properties of, 654, 655 
residues of, 700 


Analytic functions, residues of, practical 
computation of, 702 
series of, 683, 684 
singular points of, 653, 699 
Anharmonic ratio, 741 
Annulus, 646 
Antidifference, 87 
use in summing series, 88 
Argand diagram, 636 
Argument, of complex number, 637 
principal, 661 
principle of, 722 
Augmented matrix, 447 
Auxiliary equation, 38 


Beams, bending of, 56 
bending moment on, 56 
deflection curve of, 56 
end conditions for, 323, 327e.
free vibration of, 287, 323 
load per unit length on, 56, 288 
neutral axis of cross section of, 56 
neutral surface of, 56 
shear on, 56 
Beats, 162 

Ber and bei functions, 365 
Bernoulli numbers, 102 
Bernoulli’s equation, 21e. 

Bessel functions, asymptotic formulas for, 
357e. 

behavior of, at origin, 356, 359 
differentiation of, 365, 366 
equations solvable in terms of, 363 
expansions in series of, 375, 380, 385 
of first kind, 353 
modified, 358 





Bessel functions, of first kind, modified, 
asymptotic formulas for, 363e. 
generating function for, 369 
integral formulas for, 371 
integration of, 368, 372e. 

Laplace transforms of, 377, 386e. 
of order ± ½, 364
orthogonality of, 374 
recurrence relations for, 367 
of second kind, 354 
modified, 358 
of third kind, 354 
zeros of, 356, 359 

Bessel’s equation, equations reducible to, 363

modified, 357 
of order v, 351 
with a parameter, 351 
series solution of, 351 
Bessel’s inequality, 328e. 

Bilinear form, 469 
Bilinear transformation, 737 
Binomial coefficients, 83n.

Binomial expansion, 695 
Binormal, 550e.

Boundary point, 646 

Boundary-value problems, summary of, 326 


Cantilever beam, end conditions for, 323 
vibration of, 323 
Casoratti’s determinant, 119n.
Cauchy-Goursat theorem, 668 
Cauchy-Riemann equations, 652 
Cauchy’s equation, 61n.

Cauchy’s inequality, 544e., 644e. 

for $|f^{(n)}(z_0)|$, 673
Cauchy’s integral formula, 669 
extensions of, 672 
Cauchy’s theorem, 667 
Cayley-Hamilton theorem, 517 
Center of gravity, 549 
Central differences, 82 
Channel, flow out of, 752 
Characteristic curves, 301e. 

Characteristic equation, of boundary-value 
problem, 326 

of difference equation, 120 
of differential equation, 38, 52 
of matric equation, 477, 487 
of system of differential equations, 75, 462 
Characteristic functions, 326 
orthogonality of, 320, 322 
Characteristic polynomial, 477 
Characteristic root, regular, 481 
Characteristic-value problem, for matrices, 
477, 487 

for partial differential equations, 461 


Characteristic values, 326, 477 
Characteristic vectors, 477 
orthogonality of, 484, 489 
Characteristics, method of, 295n., 301e.
Christoffel symbols, of first kind, 629 
of second kind, 629 
Circle of convergence, 688 
use of, with series of real terms, 690 
Closed formula, 110 
Closed-loop system, 727 
Closed set, 646 
Closure, 318 

Coefficient of correlation, 143e. 

Cofactors, 401 
Columns, buckling of, 57 
critical loads for, 58 
Complementary functions, of difference 
equations, 119 
table of, 121 

of differential equations, 35 
table of, 40 

of systems of differential equations, 76, 
462 

Complete solution, of difference equations, 
119 

of differential equation, 5, 42 
of systems of algebraic equations, 448 
of systems of differential equations, 70, 76 
Completeness, 317 
Complex impedance, 169 
Complex inversion integral, 225, 711 
Complex number, 633 
absolute value of, 637, 641 
amplitude of, 637 
argument of, 637 
principal, 661 
components of, 634 
conjugate of, 634 
exponential form of, 658 
imaginary part of, 634 
logarithm of, 661 
modulus of, 637 
negative of, 634 
polar form of, 637 
powers of, 639, 661 
real part of, 634 
roots of, 639 

trigonometric form of, 637 
Complex numbers, addition of, 634 
division of, 634 
in polar form, 638 
equality of, 634 

geometrical representation of, 636 
inequalities for absolute values of, 641 
multiplication of, 634 
in polar form, 638 
subtraction of, 634 



Complex plane, 636 
integration in, 663-665 
Complex variable, 644 
analytic function of, 653 
exponential function of, 656 
function of, 644 
continuity of, 648 
geometrical representation of, 729 
regular point of, 653 
singular point of, 653, 699 
hyperbolic functions of, 660 
inverse, 662 

logarithmic function of, 660 
trigonometric functions of, 659 
inverse, 662 
Conductance, 169 
Conformal mapping, 732
behavior, of angles under, 735 
of infinitesimal areas under, 734 
of infinitesimal lengths under, 734 
critical points of, 733 
Congruence transformation, 443 
Conjugate complex numbers, 634 
Connected set, 646 
Conservative field, 586 
Continuity, equation of, 293, 555
of function of complex variable, 648 
of sum of infinite series, 682 
Contour integral, 664 
Contravariant tensor, 620 
covariant derivative of, 631
Contravariant vector, 603, 620 
Convergence, abscissa of, 226 
absolute, of infinite series, 677 
of Laplace transform integral, 229 
of series of matrices, 527 
circle of, 688 
conditional, 677 
of Fourier series, 194 
of improper integrals, 228 
of infinite series, 676 
ratio test for, 677 
of Laplace transforms, 229, 231 
in the mean, 318 
radius of, 689 
of series of matrices, 526 
uniform, 678 

of Laplace transform integral, 231 
Weierstrass M test for, 681 
Convergence factor, 224 
Convolution integral, 271 
Coordinates, cylindrical, 566 
generalized, 605 
normal, 503 
oblique, 597 
spherical, 389 
Corrector formula, 110 
Cosine transform, 236 


Covariant tensor, 620 
covariant derivative of, 631 
Covariant vector, 603, 620 
Cramer’s rule, 453 
Critical damping, 152 
Critical points of conformal transformations, 733

behavior of angles at, 734 
Cross product, 534
Cross ratio, 741 
invariance of, 741 
Crosscuts, 647 
Curl, 554, 556 
formulas for, 556 
in generalized coordinates, 627 
Curvature, 548 
radius of, 548 

Curve fitting, by factorial polynomials, 86 
by harmonic analysis, 206 
by Lagrange’s formula, 94 
by least squares, 135, 142e. 
by Newton’s divided-difference formula, 
91 

by orthogonal polynomials, 133 
Curve smoothing, 135 
Curves, sectionally smooth, 559 
simple closed, 568n. 

Cylindrical coordinates, 566


D, 36 

D’Alembert’s solution of wave equation, 295 
Damped oscillation, 153 
Damping, critical, 152 
viscous, 147 
Damping ratio, 157 

relation of, to logarithmic decrement, 155 
Decibels, 155n.

Definite integrals, for Bessel functions, 371
differentiation of, 274n.
evaluation of, by gamma functions, 239 
by residues, 704 

improper (see Improper integrals) 
Deformation of contours, principle of, 668 
Degrees of freedom, 145 
$\nabla$, 552
$\nabla^2$, 588
$\Delta$, 81
$\delta$, 82

$\delta$-function, 276
de Moivre’s theorem, 639
Dependence, linear, 444 
Determinants, addition theorem for, 408 
definition of, 403 
diagonal dominance in, 455n. 
differentiation of, 414e. 
double subscript notation for, 401 
elements of, 401 





Determinants, elements of, cofactors of, 
401 

minors of, 401 
expansion of, Laplace’s, 406 
by cofactors, 403 
Gramian, 460e.

Jacobian, 609 
mth-order minors of, 401
algebraic complements of, 401 
complementary, 401 
multiplication of, 411 
properties of, 407-411 
Difference equations, 118 
characteristic equation of, 120 
complementary functions of, 119 
table of, 121 
complete solution of, 119 
homogeneous, 118 
nonhomogeneous, 118 
order of, 118 

particular solutions of, 121 
table of, 122 
solution of, 118 
use of, in least squares, 143e.
in summing series, 122 
Difference operators, 81 
Difference table, 80 
Differences, advancing, 82 
central, 82 
divided, 80 

Differential, exact, condition for, 583

total, 552 

Differential equation, 1 
Bernoulli’s, 21e. 

Bessel’s, 351 
modified, 357 
Cauchy’s, 61n. 

Euler’s, 61 

having a given general solution, 6 
of heat flow, 290 
Hermite’s, 399e. 

Laguerre’s, 399e. 

Laplace’s, 291 

in cylindrical coordinates, 294e., 382 
in generalized coordinates, 627 
in spherical coordinates, 389 
Legendre’s, 391 
associated, 390 

linear (see Linear differential equations) 
nonlinear, 2 
order of, 2 
ordinary, 2 
partial, 2 
Poisson’s, 588 
solution of, 1 
complete, 5 
general, 5 


Differential equation, solution of, singular, 
5 

of vibrating beam, 288 
of vibrating membrane, 285 
of vibrating shaft, 287 
of vibrating string, 284 
Differential equations, Cauchy-Riemann, 
652 

of electrical circuits, 150 
first-order, exact, 14 
existence theorem for, 5 
homogeneous, 11 
linear, 19 
separable, 8 
of higher order, 52 
Maxwell’s, 591 
of mechanical systems, 150 
numerical solution of, 108 
Adams-Bashforth method, 116e. 
Adams-Moulton method, 117e. 

Euler’s method, 112 
Kutta’s third-order approximation, 112 
Milne’s method, 108 
modified Euler method, 112 
Runge-Kutta method, 114
Runge’s method, 112 
of second order, homogeneous, 30 
nonhomogeneous, 30 
series solution of, 347 
simultaneous (see Simultaneous differential equations)

solvable in terms of Bessel functions, 363 
of transmission line, 292 
Differentiation, of analytic function, 653 
of Bessel functions, 365, 366, 372e. 
of definite integral, 274n. 
of determinants, 414e. 
of Fourier series, 195 
of improper integrals, 229 
of infinite series, 684, 685 
of Laplace transforms, 251 
of matrices, 428e. 
numerical, 99, 136 
of vector functions, 545 
Dimension of set of vectors, 456 
Directional derivative, 551 
Dirichlet conditions, 185 
Dirichlet’s theorem, 185 
Distributions, theory of, 278n.

Divergence, 554 
formulas for, 556 
in generalized coordinates, 624 
Divergence theorem, 572-574 
Divided differences, 80, 90e. 

Domain, 646 
Dot product, 418, 534 
Doublet function, 277 
Duhamel’s formulas, 274 






Dummy index, 610 
Dyad, 535n.


E, 82 

Eigenfunction, 326 
Eigenvalue, 326 
Eigenvector, 478n. 

Einstein summation convention, 610 
Elastance, 165 
Elasticity matrix, 435 
Electrical circuits, differential equations of, 
150 

forced vibrations of, 167 
free vibrations of, 166, 174 
laws of, 148, 149 
Electrostatic field, 585 
Elementary functions of complex variables, 
656 

Energy method, 59 
Equivalence transformation, 443 
Equivalent equations, 449n. 

Equivalent matrices, 438 
Error function, 343 
Error signal, 727 
Essential singularity, 699 
Euler-Maclaurin summation formula, 101 
Euler’s equation, 61 

Euler’s formulas, for cos 0 and sin 0, 659 
for Fourier coefficients, 182 
Euler’s theorem on homogeneous functions, 
14e. 

Even function, 189 
Fourier expansion of, 190 
Fourier integral of, 217 
Exact differential equation, 15 
Exponential function, 189 
Exponential order, 226 
Exterior point, 646 


Factorial polynomials, 84 
expansions in terms of, 85 
Falling bodies, 27e. 

Faltung integral, 271 
Faraday’s law, 589 
Feedback loop, 726 
Feedback signal, 727 
Field, conservative, 586
electrostatic, 585
gravitational, 585
magnetic, 585
Field intensity, 586 
Finite differences, 79 
First-order reaction, 26 
Forced motion, 151 


Forced vibrations, of electric circuits, 68 
of mass-spring systems, 156, 161, 497 
magnification ratio for, 157 
phase angle for, 158 
Fourier integrals, 211 

approximation by, when upper limit is 
finite, 219 

for even functions, 217 
exponential form of, 215 
initial conditions fitted by, 332 
as limit of Fourier series, 211 
for odd functions, 217 
relation to Laplace transforms, 222 
transform-pair forms of, 215, 217 
trigonometric form of, 216 
Fourier series, 181 
alternative forms of, 196 
amplitude spectrum of, 213, 214 
coefficients of, 184 
behavior of, for large n, 194 
complex form of, 197 
convergence of, 194 
differentiation of, 195 
half-range, 192 
of even functions, 190, 196e. 
of odd functions, 191, 196e. 
initial conditions fitted by, 306, 308, 309 
integration of, 195 
nth harmonic of, 197 
periodic excitations represented by, 201, 
204 

plots of partial sums of, 187 
Fourier transform pair, for even function, 
217 

for odd function, 217 
unilateral, 222 

Fourier transforms, 221e., 222e.

Fourier’s law of heat conduction, 27e. 

Free motion, 151 
critically damped, 152 
overdamped, 151 
underdamped, 153 

Free vibrations, amplitude of, 306, 308, 328 
of beams, 287, 323 
of electric circuits, 166, 174 
of mass-spring systems, 151 
of shafts, 302 
of strings, 298, 379 
Frequency, effect of friction on, 154 
natural, of cantilever beams, 324 
of LC circuits, 166, 174 
of mass-spring systems, 154, 499 
of shafts, 306, 308, 309 
of strings, 299, 310e. 

Frequency equation, 326 
determinantal, 499 
Frequency ratio, 157 






Friction, coefficient of, 164e. 

Coulomb, 164e. 

effect of, on frequency, 154 

viscous, 146 

Frobenius, method of, 347 
Function, $\delta$, 276
entire, 691 
error, 343 
even, 189 
exponential, 657 
filter, 248 
gamma, 238 

generalized factorial, 238 
harmonic, 654 
holomorphic, 653 
homogeneous, 12 
impulse, 276 
integral, 691 

logarithmic, of complex variable, 660 
principal value of, 661 
Morse dot, 266 
odd, 190 
periodic, 182n. 
potential, 586 
rms value of, 200 
sine integral, 218 
staircase, 262 
transfer, 273, 727 
unit doublet, 277 
unit step, 237 

Functions, analytic (see Analytic functions) 
ber and bei, 365 
Bessel (see Bessel functions) 
characteristic, 326 
complementary (see Complementary 
functions) 

conjugate harmonic, 654, 656e. 
elementary, of complex variables, 656 
of exponential order, 226 
Hankel, 354 

hyperbolic, of complex variables, 660 
inverse, 662 
ker and kei, 362 
Legendre, of second kind, 391 
orthogonal (see Orthogonal functions) 
orthonormal, 315 
regular, 653 
piecewise, 226 
singularity, 277, 280e. 
translated and “cut off,” 247 
trigonometric, of complex variables, 659 
inverse, 662 

vector (see Vector functions) 
Fundamental metric tensor, 621 


Gamma function, 238 
graph of, 239 


Gauss’ law, for electric fields, 589 
for magnetic fields, 589 
Gauss’ reduction, 449 
Gauss’ theorem, 577, 585e. 

Generalized coordinates, 605 
curl in, 627 

differential of arc in, 606 
divergence in, 627 
Laplacian in, 626 
length of vector in, 607 
local base vectors in, 606 
local reciprocal base vectors in, 606 
parametric curves in, 605 
transformations of, 609 
Generalized functions, 278 
Generalized orthogonality, 470 
Generating function, for Bessel functions, 
370 

for Legendre polynomials, 394 
Gradient, 551 

geometrical properties of, 551, 552 
Graeffe’s root-squaring process, 755 
Gram determinant, 460e. 

Gramian, 460e. 

Gravitational field, 585 
Gravitational potential, 588 
Green’s lemma, 567-570 
Green’s theorem, 575 
Gregory-Newton formula, backward, 95 
forward, 94 

Gregory’s formula of numerical integration, 
104 


Half-range Fourier series, 189 
Hankel functions, 354 
Harmonic analysis, 206 
Harmonic functions, 654 
conjugate, 654, 656e. 

Harmonics, 197 
higher, resonance with, 202 
spherical, 389 
surface, 390 
zonal, 392 
Heat equation, 290 
solution of, 311, 333, 382, 397 
uniqueness of solutions of, 593 
Heat flow, in cooling fins, 381, 387e. 
in cylinders, 382 
differential equation of, 290 
uniqueness of solutions of, 593, 594e. 
laws of, 27e., 288, 312 
in spheres, 397 
in thin rods, 311, 331 
in thin sheets, 333, 744 
Heaviside’s expansion theorems, 255 
Hermite polynomials, 399e. 

Hermite’s equation, 399e. 





Hermitian form, 469 
Holomorphic function, 653
Homogeneous differential equations, first- 
order, 11 

higher-order, 30, 52 
simultaneous, 74, 461 
Homogeneous functions, 12 
Euler’s theorem on, 14e. 

Hyperbolic functions of complex variables, 
660, 662 


Impedance, electrical, 167 
complex, 169 

parallel combinations of, 169 
series combinations of, 169 
mechanical, 167 

Improper integrals, continuity of, 228 
convergence of, 228 
differentiation of, 229 
integration of, 229 
principal value of, 706n. 

Impulse function, 276 
Inconsistent equations, 450 
Independence, linear, 444 
Indicial admittance, 273 
Indicial equation, 348 
Inequalities, Cauchy’s, 544e., 644e. 
for $|f^{(n)}(z_0)|$, 673
for complex numbers, 641 
for line integrals, 665 
Infinite series (see Series) 

Inner product, of tensors, 622 
of vectors, 418, 534 
Integral, complex inversion, 225, 711 
contour, 664 
convolution, 271 
of $J_0(x)$, 369n., 371e.
running, 105 
surface, 565 
volume, 566 

(See also Definite integrals, Fourier integrals, Improper integrals, Line
integrals, Particular integrals)
Integrating factor, 17, 20 
Integration, of Bessel functions, 368, 372e. 
of Fourier series, 195 
of improper integrals, 229 
of infinite series, 683 
of Laplace transforms, 252 
line, 560 

in complex plane, 664 
numerical, 104 

of differential equations, 108, 116e., 
117e. 

Integrodifferential equation, 148 
Interior point, 646 


Interpolation formulas, Gregory-Newton, 
backward, 95 
forward, 94 
Lagrange’s, 94 
Laplace-Everett, 97 
Newton-Gauss, backward, 97 
forward, 96 

Newton’s divided difference, 91 
Stirling’s, 97 
Inversion, 738 


$J_0(x)$, integral of, 369n., 371e.
Jacobian, 609 

of conformal transformation, 733 
Jacobian determinant, 609 
Jacobian matrix, 609 
Jordan canonical form, 497 


Ker and Kei functions, 362 
Kernel of transform, 236 
Kirchhoff’s first law, 149 
Kirchhoff’s second law, 148


$\mathcal{L}$, 227
$\mathcal{L}^{-1}$, 227

Lagrange’s identity, 543 
Lagrange’s interpolation formula, 94 
Lagrange’s reduction, 470 
Laguerre polynomials, 399e. 

Laguerre’s equation, 399e. 

Lambert’s law, 25e. 

Laplace-Everett interpolation formula, 97 
Laplace transform pair, 225 
Laplace transforms, of Bessel functions, 
377, 386e.

convergence of, absolute, 229 
uniform, 231 
of derivatives, 234 
differentiation of, 251 
of elementary functions, 237 
Heaviside’s theorems on, 255 
of integrals, 234 
integration of, 252 
inversion integral for, 225, 711 
limit theorems for, 242, 243, 254e.
of periodic functions, 260 
tables for, 266, 267 
products of, 270 
of products containing e~ at , 245 
relation to Fourier integrals, 227 
of singularity functions, 278, 280e. 
solution, of differential equations by, 235 
of partial differential equations by, 338 
of translated functions, 248 





Laplace’s equation, 291 
in cylindrical coordinates, 294e., 382 
in generalized coordinates, 626 
invariance under conformal transformation, 735

relation to analytic functions, 654, 669 
in spherical coordinates, 389 
Laurent’s expansion, 692 
uniqueness of, 695 
Least squares, 126 
acceleration smoothing by, 136 
curve smoothing by, 135, 142e. 
dangers in logarithmic transformations 
in, 139 

relation to orthogonal functions, 131 
use in, of difference equations, 143e. 
of orthogonal polynomials, 130 
of Taylor series, 137 
velocity smoothing by, 135, 142e. 
Legendre functions, 391 
Legendre polynomials, 388 
algebraic form of, 392 
generating function for, 394 
orthogonality of, 397, 399e. 

Rodrigues’ formula for, 392 
series of, 398 

trigonometric form of, 395 
Legendre’s equation, 391 
associated, 390 
algebraic form of, 391 
Leibnitz’ rule, 274n. 

Lerch’s theorem, 244n.

Level surface, 552

Limit of function of complex variable, 648 
Limit point, 646 

Line integrals, in complex plane, 664 
conditions for independence of path, 
579-582 

inequalities for, 665 
real, 560 

geometrical interpretation of, 562 
of vector functions, 560 
Linear combination, 445 
Linear dependence, 444 
Linear differential equations, 2 
complementary function of, 35 
complete solution of, 33 
with constant coefficients, 35 
auxiliary equation of, 38 
characteristic equation of, 38 
higher order, 30, 52 
homogeneous, 30, 36 
nonhomogeneous, 30, 42 
particular integrals of, by Laplace 
transforms, 272 
exponents of, 348 
finding second solution of, 33 
first-order, 19 


Linear differential equations, indicial equation of, 348
ordinary point of, 345 
particular integral of, 35 
by variation of parameters, 49 
series solution of, 347 
singular point of, 345 
irregular, 346 
regular, 346 

Linear equations, systems of, 447 
augmented matrix of, 447 
coefficient matrix of, 447 
complete solution of, 448 
equivalent, 449n.

Gauss reduction for, 449 
homogeneous, 447 
inconsistent, 450 
nonhomogeneous, 447 
trivial solution of, 454 
Linear fractional transformation, 737 
Linear independence, 444 
Linear transformation, 425 
matrix of, 425 
Liouville’s theorem, 691 
Load per unit length, 56, 288 
relation to shear, 57 
Logarithmic decrement, 155 
relation to damping ratio, 155 
Logarithmic function, 660 
principal value of, 661 
Lumped parameters, 145 


M test, 681 
Magnetic field, 585 
Magnification ratio, 157 
Mapping, 729 
conformal, 732 
isogonal, 735 

Matric differential equations, 461 
Matric equations, 447 
characteristic equation of, 477, 487 
characteristic values of, 477, 487 
reality of, 482 

characteristic vectors of, 477, 487 
independence of, 484, 486, 490 
normalized, 491 
orthogonality of, 484, 489 
solution of, 513 
Matrices, addition of, 418 
conformable, 419 
conformably partitioned, 424 
diagonalization of, 493, 495 
elementary transformations of, 437 
equal, 415 

equivalent to diagonal matrix, 493 
multiplication of, 420, 422 





Matrices, series of, 525 
similar, 443 

characteristic polynomials of, 479 
similar to diagonal matrix, 495 
subtraction of, 418 
transformations of, 492 
congruence, 443 
equivalence, 443 
orthogonal, 443 
similarity, 443 
unitary, 443 

transpose of products of, 423 
Matrix, 415 
adjoint of, 430 
admittance, 417 
associate of, 416 
augmented, 447 
characteristic equation of, 477 
characteristic polynomial of, 477 
characteristic values of, 477 
multiple, 481 
reality of, 481 
regular, 481 

characteristic vectors of, 477 
independence of, 480, 484 
orthogonality of, 484 
coefficient, 447 
column, 415 
conjugate of, 416 
derivative of, 428e. 
diagonal, 415 
elasticity, 435 
Hermitian, 416 
imaginary, 416 
inverse of, 430 
Jacobian, 609n. 
lower triangular, 415 
minimum polynomial of, 521 
minors of, 416 
principal, 416 
modal, 487 
nonsingular, 429 
nth power of, 506 
null, 416 
orthogonal, 435 
rank of, 437 
column, 456 
determinant, 437 
row, 456 

Sylvester’s law of nullity for, 457 
real, 416 
row, 415 
scalar, 428e. 
singular, 429 
skew-Hermitian, 416 
skew-symmetric, 416 
square, 415 

determinant of, 415 


Matrix, square, functions of, 505 
square roots of, 514 
trace of, 479 
(See also Square matrix) 
stiffness, 434 
symmetric, 416 
transpose of, 415 
unit, 416 
unitary, 435 
zero, 416 

Maxima and minima of functions of several 
variables, 474 
Maxwell’s equations, 591 
Mechanical impedance, 167 
Milne’s method, 108 
Minimum polynomial, 521 
Minors, 401, 416 
Mobius transformation, 737 
Modal matrix, 487 

Modified Bessel functions, of first kind, 358 
of second kind, 358 
Modulus, of complex number, 637 
of elasticity, 294 
of spring, 59 
Moment, bending, 56 
of force about a point, 545e.
vector, 545e. 

Motion, critically damped, 152 
forced, 151 
free, 151 
overdamped, 151 
steady-state, 160 
transient, 159 
underdamped, 153 
Multiply-connected set, 646 


Natural frequency, 154 
Neighborhood, 646 
Nepers, 155 
Neutral axis, 56 
Neutral surface, 56 

Newton-Gauss interpolation formula, backward, 97
forward, 96 

Newton’s divided-difference formula, 91 
remainder term in, 92 
Newton’s law of cooling, 27e. 

Newton’s second law of motion, 27e., 146 
in torsional form, 26e., 146 
Normal acceleration, 548 
Normal coordinates, 503 
Normal equations, 129 
Normal modes, 326, 501 
Null function, 316 

Numerical methods, of differentiation, 99, 
136 

of harmonic analysis, 206 





Numerical methods, of integration, 103, 107e.

of solving differential equations, 108 
of solving equations, 314, 324, 755
Nyquist stability criterion, 726 

Oblique coordinates, 597 
metrical properties of space in, 598, 601 
reciprocal base vectors in, 599 
reference vectors in, 597 
length of, 599 
Odd function, 190 
Fourier expansion of, 191 
Fourier integral for, 217 
Ohm’s law, 167 
Open formula, 110 
Open-loop system, 726 
Open set, 646 

Operational calculus (see Laplace transforms)

Operators, D, 36 
$\nabla$, 552
$\nabla^2$, 291, 558
$\Delta$, 81
$\delta$, 82
E, 82

equivalent, 83
$\mathcal{L}$, 227
$\mathcal{L}^{-1}$, 227

Order, of difference equation, 118
of differential equation, 2 
Ordinary point, 345 
Orthogonal functions, 315 
closure of, 318 
completeness of, 317 
expansions in series of, 316 
least-square approximation by, 328e. 
Orthogonal polynomials, 130 
use of, in least squares, 131 
in smoothing of data, 135, 142e. 
Orthogonal trajectories, 28e., 655 
Orthogonal transformations, 443 
Orthogonal vectors, 419, 470, 534 
Orthogonality with respect to, symmetric matrix, 470 
weight function, 316 
Orthonormal functions, 315 
Orthonormal vectors, 458 
Osculating plane, 548, 550e. 

Partial differential equations, 2, 283 
elliptic, 301e. 
hyperbolic, 301e. 
parabolic, 301e. 

solution of, by D’Alembert method, 294 
by Laplace transforms, 338 
by separation of variables, 302 


Particular integrals, 35 
by Laplace transforms, 272 
for simultaneous differential equations, 76, 465 

by undetermined coefficients, 43 
table of, 46 

by variation of parameters, 49 
Period, 182n. 

Periodic function, 182n. 

Phase angle, 158 
Poisson’s equation, 588 
Poisson’s formula, 675e. 

Polar, 470 
Pole, 699 
order of, 699 

principal part of f(z) at, 699 
Polynomials, factorial, 84 
finding zeros of, 755 
interpolation, 91, 94 
orthogonal, 130 
Potential, 586 
gravitational, 588 
Predictor formula, 110 
Principle of argument, 722 
Probability integral, 343n. 

Products, of complex numbers, 634, 638 
cross, 534 

of determinants, 411 
dot, 418, 534 
inner, 418 

of Laplace transforms, 270 
of matrices, 420, 422 
scalar, 418, 534 
scalar triple, 538 
of series, 678 
vector, 534 
vector triple, 541 

Quadratic form, 466 
indefinite, 467 
kinetic energy, 503 
matrix of, 467 
negative, 466 
negative-definite, 466 
conditions for, 468 
nonsingular, 467 

polar of point with respect to, 470 
positive, 466 
positive-definite, 466 
conditions for, 468 
potential energy, 476e., 503 
reduction of, to sum of squares, 471 
semidefinite, 466 
singular, 467 
Quality factor, 171e. 




Radius, of convergence, 347, 689 
of curvature, 548 
Rank, 437 
column, 456 
determinant, 437 
row, 456 

of tensor, 620, 621 
Ratio test, 676 
Reactance, 169 
Region, 646 
boundary point of, 646 
bounded, 646 
closed, 646 
exterior point of, 646 
interior point of, 646 
multiply connected, 582 
open, 646 

simply connected, 582 
unbounded, 646 
Regular function, 653 
piecewise, 226 
Regular point, 653 
Residue theorem, 701 
Residues, 700 
calculation of, 702 

evaluation of definite integrals by, 704-708 

Resonance, 162 
with higher harmonics, 202 
Rodrigues’ formula, 392 
Root-coefficient relations, 719, 757 
Routh-Hurwitz stability criterion, 721 
Running integral, 105 


Scalar, 418, 532 
Scalar product, 418, 534 
Scalar triple product, 538 
Schmidt orthogonalization process, 458, 470 
Schwarz-Christoffel transformation, 748 
Self-conjugate form of equation of circle, 636e., 738 

Separable differential equation, 8 
Separation of variables, in first-order differential equations, 8 
in partial differential equations, 302 
Sequence of matrices, convergence of, 525 
divergent, 525 
Series, addition of, 678 
of Bessel functions, 375, 380, 385 
convergence of (see Convergence) 
differentiation of, 684, 685 
Fourier (see Fourier series) 
integration of, 683 
Laurent’s, 692 

of Legendre polynomials, 398 
Maclaurin’s, 689 
of matrices, 528 


multiplication of, 678 
of orthogonal functions, 318 
partial sums of, 676 
power, 689 

rearrangement of terms of, 677 
region of convergence of, 676 
remainder after n terms of, 676 
sum of, 676 

continuity of, 682 
summation of finite, 88, 89e. 

Taylor’s, 687 

circle of convergence of, 688 
radius of convergence of, 689 
Set of vectors, dimension of, 456 
orthonormal, 458 

Schmidt orthogonalization process for, 458, 470 

Shaft vibrations, longitudinal, 293e. 

torsional, 302 
Shear, 56 

relation of, to bending moment, 57 
Similarity transformation, 443 
Simple closed curve, 568n. 

Simply connected set, 646 
Simpson’s rule, 107e. 

Simultaneous algebraic equations, 447 
consistency of, 450 
Gauss’ reduction for, 449 
homogeneous, 447 

condition for nontrivial solution of, 454 

solution vectors of, 418, 447 
nonhomogeneous, 447 
augmented matrix of, 447 
solution of, by Cramer’s rule, 453 
Simultaneous differential equations, 66, 461 
characteristic equation of, 75, 462 
complementary function for, 76, 463 
complete solution of, 76, 463 
in matrix form, 461 
particular integral of, 76, 465 
reduction to single equation, 67 
Sine integral function, 218 
Sine transform, 236 

Singular point, of analytic function, 653 
essential, 699 
isolated, 699 

of differential equation, 346 
irregular, 346 
regular, 346 
residue at, 700 
Singularity functions, 277 
Specific heat, 189 
Spherical coordinates, 389 

differential of arc length in, 615 
Spherical harmonic, 389 
Spring modulus, 59 





Square matrix, 415 
characteristic equation of, 477 
characteristic values of, 477 
characteristic vectors of, 477 
integral powers of, 505 
minimum polynomial of, 521 
polynomial annihilate of, 521 
polynomial equations in, 513 
polynomial functions of, 506 
characteristic vectors of, 510 
power series in, 528 
rational functions of, 508 
characteristic values of, 510 
trace of, 479 
(See also Matrix, square) 

Stability criteria, 716 
for cubic equations, 719 
Nyquist, 726 
Routh-Hurwitz, 721 
Static deflection, 157 
Steady state, 160 
Stefan’s law, 312 
Stiffness matrix, 434 
Stirling’s interpolation formula, 97 
Stokes’ theorem, 577-579 
Stream lines from channel, 752 
String, forced vibrations of, 329 
traveling waves on, 296, 298, 339 
vibration of, 283, 379 
Sturm-Liouville theorem, 320 
extension to fourth-order systems, 322 
Submatrices, 416 

Summation of finite series, 88, 89e. 

Surface harmonic, 390 
Susceptance, 169 

Systems, with one degree of freedom, 145 
with several degrees of freedom, 145, 171 
use of difference equations in, 123, 174 


Tangential acceleration, 548 
Taylor’s series, 687 
use of, in least squares, 137 
Taylor’s theorem, 686 
Telegraph equations, 292 
Telephone equations, 292 
Tensor, alternating, 621 
of arbitrary rank, 621 
components of, 620 
contraction of, 622 
contravariant, of rank 1, 620 
of rank 2, 620 
covariant, of rank 1, 620 
of rank 2, 620 

covariant derivative of, 631, 632 
fundamental metric, 621 
mixed, of rank 2, 621 
of rank zero, 620 


skew-symmetric, 621 
symmetric, 621 
Tensors, equal, 621 
inner product of, 622 
outer product of, 621 
quotient law for, 623 
sum of, 621 

Theorem, binomial, 695 
Cauchy-Goursat, 668 
Cauchy’s, 667 
de Moivre’s, 639 
divergence, 572 
Euler’s, 14 
Gauss’, 577, 585e. 

Green’s, 575 
Heaviside’s, 255 
Lerch’s, 244n. 

Liouville’s, 691 
maximum modulus, 674 
Morera’s, 672 
Parseval’s, 318 
residue, 701 
Stokes’, 577-579 
Sturm-Liouville, 320 
extended, 322 
Taylor’s, 686 

Theory of distributions, 278n. 

Thermal conductivity, 289 
Torricelli’s law, 23 
Torsion of space curve, 550e. 

Torsional rigidity, 286 
Total differential, 14, 552 
Trajectories, orthogonal, 28e., 655 
Transfer function, 273, 727 
Transformations of matrices, 443 
Transforms, cosine, 236e. 

Fourier, 221e. 

Laplace (see Laplace transforms) 
sine, 236e. 

(See also Fourier transform pair) 
Transient, 159 

Transition probabilities, matrix of, 426 
Transmission line, equations for, 291 
steady-state behavior of, 332 
transient behavior of, 342 
Trapezoidal rule, 105 
use of, in running integration, 105 
Trigonometric functions of complex variables, 659, 662 
Trivial solution, 454 


Umbral index, 610 

Undetermined coefficients, method of, 43, 121 

Uniform convergence, of infinite integrals, 228 

of infinite series, 678 




of Laplace transform integral, 231 

Unilateral Fourier transform pair, 222 

Unit doublet, 277 

Unit impulse, 276 

Unit step function, 237 

Unit triplet, 278 

Unit vectors, 419, 533 

Unitary transformations, 443 


Variation of parameters, 49 
Vector, 417, 532 
absolute value of, 418, 532 
components of, 418 
curl of, 554 
divergence of, 554 
length of, 418 
generalized, 470 
negative of, 533 
product of scalar and, 533 
unit, 419, 533 
zero, 533 

Vector acceleration, 548 
Vector angular velocity, 555 
Vector functions, 545 
derivative of, 545 
differential of, 545 
line integral of, 560 
surface integral of, 565 
Vector moment, 545e. 

Vector product, 534 
Vector triple product, 541 
Vector velocity, 548 
Vectors, addition of, 533 
characteristic, of square matrix, 478 
contravariant representation of, 603 
covariant representation of, 603 
cross product of, 534 


difference of, 533 
dot product of, 418, 534 
equal, 533 

inner product of, 418, 534 
normalized, 470 
orthogonal, 419 
orthonormal, 458 
reciprocal sets of, 544e., 599 
in same direction, 419 
scalar product of, 534 
solution, 418, 447, 477 
orthogonality of, 484, 488, 500 
Velocity smoothing, 134, 142e. 

Vibrations, amplitude modulation of, 162 
of beams, 287, 323 
damped, 153 

of electric circuits, 165, 174 
forced (see Forced vibrations) 
free (see Free vibrations) 
of membranes, 284 
normal modes of, 326, 501 
of shafts, 285, 302 
of strings, 282, 298, 329, 379 
Volume integral, 566 


Wave equation, one-dimensional, 284 
D’Alembert solution of, 295 
two-dimensional, 285 
Weierstrass M test, 681 
Work, 563 
Wronskian, 31, 52n. 


Zeros, of Bessel functions, 353, 356, 357e., 359 

within given contour, 722 
Zonal harmonic, 392 



